Image processing device and method

ABSTRACT

The present technology relates to an image processing device and method that achieve a reduction in buffer size. The image processing device partitions a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction. The image processing device makes the determination by using the cost calculated based on the partitioned processing units. The present technology is applicable to encoding devices or decoding devices.

TECHNICAL FIELD

The present technology relates to an image processing device and method, in particular, to an image processing device and method that achieve a reduction in buffer size.

BACKGROUND ART

The VVC standard, a next-generation codec, has been developed as a successor to AVC/H.264 and HEVC/H.265.

In the VVC standard, in which large CUs (Coding Units) up to 128×128 are employed, the concept of VPDUs (Virtual Pipeline Data Units) has also been introduced in consideration of increases in circuit scale and power consumption in HW decoder implementation, in particular.

The VPDU size is a buffer size that allows smooth processing on each pipeline stage. The VPDU size is often set to the maximum size of TUs (Transform Units).

In VVC, the maximum TU size is 64×64, and the same is assumed to hold true for VPDUs. In VVC, one CU corresponds to one PU, and hence inter prediction processing is required to be performed on PUs larger than VPDUs. Even in this case, the PU can be partitioned into virtual vPUs (virtual PUs) to be processed. VVC is consistent with VPDUs and could be implemented with reasonable HW resources until BIO (Bi-directional optical flow), described later, was employed.

The optical flow method is an image processing method for detecting the motion of an object in a moving image, to thereby estimate a direction in which the object is to move in a certain period of time. Codec inter prediction employing the optical flow method as an option enhances the encoding efficiency. The term “BIO” is based on the fact that the optical flow method is used in Bi prediction (bidirectional prediction) in which temporally continuous frames are referred to in units of frames (see NPL 1).

In normal Bi prediction, difference MVs (MVDs) are encoded since there are differences between optimal MVs and predicted MVs (PMVs). In Bi prediction employing BIO, on the other hand, a result equivalent to that in normal Bi prediction is obtained as follows: a gradient (G) and a velocity (V) are obtained by the optical flow method for prediction blocks generated with predicted MVs (PMVs). In such a case, the encoding of difference MVs (MVDs) can be eliminated so that the encoding efficiency is enhanced (see NPL 2).

Meanwhile, the calculation costs of the gradient (G) and the velocity (V), which are obtained in BIO, are very high. Thus, a reduction is particularly required in terms of cost-effectiveness in a case where, as a result of the calculation of the gradient (G) and the velocity (V), there is almost no difference from prediction values obtained by normal Bi prediction due to small absolute values, for example.

Various reduction methods in terms of BIO have been proposed. In one of the reduction methods, the SAD (Sum of Absolute Difference) of an L0 prediction block and an L1 prediction block is calculated when the blocks are generated, and, in a case where the SAD value falls below a certain threshold, BIO is not applied and normal Bi prediction is executed.

This is based on a tendency that the velocity (V) is small and BIO is thus not very effective when the SAD value is small, and achieves early termination, that is, eliminates the high-cost calculation in a case where the effect is not expected.

CITATION LIST

Non Patent Literature

-   [NPL 1]

Jianle Chen, Yan Ye, Seung Hwan Kim, “Algorithm description for Versatile Video Coding and Test Model 3 (VTM 3),” [online], Sep. 24, 2018, Joint Video Experts Team (JVET), [retrieved on Dec. 21, 2018], Internet, <http://phenix.it-sudparis.eu/jvet/doc_end_user/documents/12_Macao/wg11/JVET-L1002-v1.zip>

-   [NPL 2]

Xiaoyu Xiu, Yuwen He, Yan Ye, “CE9-related: Complexity reduction and bit-width control for bi-directional optical flow (BIO),” [online], Sep. 24, 2018, Joint Video Experts Team (JVET), [retrieved on Dec. 21, 2018], Internet, <http://phenix.it-sudparis.eu/jvet/doc_end_user/documents/12_Macao/wg11/JVET-L0256-v3.zip>

SUMMARY

Technical Problem

In a case where the reduction method in terms of BIO described above is applied, the SAD of L0 and L1 prediction blocks is calculated for an entire PU and compared to the threshold, thereby determining whether or not to apply BIO processing, and the processing then branches. Thus, in a case where inter prediction is performed on a PU larger than a VPDU, it is difficult to virtually partition the PU into a plurality of vPUs.

In this case, as a buffer necessary for gradient calculation or velocity calculation, a region slightly larger than the PU is required, with the result that a BIO-included inter prediction processing unit requires a large buffer resource.

The present technology has been made in view of such circumstances, and achieves a reduction in buffer size.

Solution to Problem

According to an aspect of the present technology, there is provided an image processing device including a control unit configured to partition a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction; and a determination unit configured to make the determination by using the cost calculated based on the partitioned processing units.

According to an aspect of the present technology, a unit of processing is partitioned into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction, and the determination is made by using the cost calculated based on the partitioned processing units.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram illustrating an example in which a pipeline is structured without the introduction of VPDUs.

FIG. 2 is a flowchart illustrating Bi prediction that is one type of inter PU processing in the case of FIG. 1.

FIG. 3 is a diagram illustrating an example in which a pipeline is efficiently structured with the introduction of VPDUs.

FIG. 4 is a flowchart illustrating Bi prediction that is one type of inter PU processing in the case of FIG. 3.

FIG. 5 is a diagram illustrating exemplary normal Bi prediction.

FIG. 6 is a diagram illustrating exemplary Bi prediction employing BIO.

FIG. 7 is a diagram illustrating exemplary 2-block partition in normal Bi prediction.

FIG. 8 is a diagram illustrating exemplary 2-block partition in Bi prediction employing BIO.

FIG. 9 is a block diagram illustrating a configuration example of an encoding device according to an embodiment of the present technology.

FIG. 10 is a flowchart illustrating details of encoding processing by the encoding device.

FIG. 11 is a flowchart illustrating the details of the encoding processing by the encoding device, which is a continuation of FIG. 10.

FIG. 12 is a block diagram illustrating a configuration example of an embodiment of a decoding device to which the present disclosure is applied.

FIG. 13 is a flowchart illustrating details of decoding processing by the decoding device.

FIG. 14 is a block diagram illustrating a configuration example of an inter prediction unit.

FIG. 15 is a flowchart illustrating related-art BIO-included Bi prediction.

FIG. 16 is a flowchart illustrating the related-art BIO-included Bi prediction, which is a continuation of FIG. 15.

FIG. 17 is a flowchart illustrating BIO-included Bi prediction according to a first embodiment of the present technology.

FIG. 18 is a flowchart illustrating the BIO-included Bi prediction according to the first embodiment of the present technology, which is a continuation of FIG. 17.

FIG. 19 is a diagram illustrating correspondences between PU size, vPU number, and processing position and size.

FIG. 20 is a diagram illustrating comparisons between related-art operation and operation according to the first embodiment of the present technology.

FIG. 21 is a diagram illustrating comparisons between the related-art operation and the operation according to the first embodiment of the present technology.

FIG. 22 is a diagram illustrating an example in which, in a case where PUs are larger than VPDUs, a BIO determination result for a vPU number of 0 is also used for another vPU.

FIG. 23 is a diagram illustrating an example in which, in a case where PUs are larger than VPDUs, a BIO determination result for the vPU number of 0 is also used for the other vPU.

FIG. 24 is a flowchart illustrating BIO-included Bi prediction in the cases of FIG. 22 and FIG. 23.

FIG. 25 is a flowchart illustrating the BIO-included Bi prediction in the cases of FIG. 22 and FIG. 23, which is a continuation of FIG. 24.

FIG. 26 is a diagram illustrating an example in which whether to apply BIO is determined with a partial SAD value in each vPU.

FIG. 27 is another diagram illustrating an example in which whether to apply BIO is determined with a partial SAD value in each vPU.

FIG. 28 is a flowchart illustrating the processing of determining a partial SAD calculation region for determining BIO_vPU_ON in each vPU.

FIG. 29 is a flowchart illustrating the processing of determining a partial SAD calculation region for determining BIO_vPU_ON in each vPU, which is a continuation of FIG. 28.

FIG. 30 is a flowchart illustrating, as an operation example according to a second embodiment of the present technology, BIO-included Bi prediction that is performed by an inter prediction unit 51.

FIG. 31 is a flowchart illustrating, as the operation example according to the second embodiment of the present technology, the BIO-included Bi prediction that is performed by the inter prediction unit 51, which is a continuation of FIG. 30.

FIG. 32 is a diagram illustrating correspondence between BIO_MAX_SAD_BLOCK_SIZE and sPU.

FIG. 33 is a flowchart illustrating, as an operation example according to a third embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

FIG. 34 is a flowchart illustrating, as the operation example according to the third embodiment of the present technology, the BIO-included Bi prediction that is performed by the inter prediction unit 51, which is a continuation of FIG. 33.

FIG. 35 is a diagram illustrating exemplary regions for calculating SADs in each PU in a case where BIO_MAX_SAD_BLOCK_SIZE is 2.

FIG. 36 is another diagram illustrating exemplary regions for calculating SADs in each PU in the case where BIO_MAX_SAD_BLOCK_SIZE is 2.

FIG. 37 is a flowchart illustrating, as an operation example according to a fourth embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

FIG. 38 is a flowchart illustrating, as the operation example according to the fourth embodiment of the present technology, the BIO-included Bi prediction that is performed by the inter prediction unit 51, which is a continuation of FIG. 37.

FIG. 39 is a flowchart illustrating, as an operation example according to a fifth embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

FIG. 40 is a flowchart illustrating, as the operation example according to the fifth embodiment of the present technology, the BIO-included Bi prediction that is performed by the inter prediction unit 51, which is a continuation of FIG. 39.

FIG. 41 is a block diagram illustrating a configuration example of a computer.

DESCRIPTION OF EMBODIMENTS

Now, modes for carrying out the present technology are described. The following items are described in order.

0. Outline

1. First Embodiment (Exemplary Partition with vPUs)

2. Second Embodiment (Exemplary Operation Sharing with Flags)

3. Third Embodiment (Exemplary Partition with sPUs)

4. Fourth Embodiment (Example in which Use of BIO Is Prohibited)

5. Fifth Embodiment (Example in which BIO Is Always Applied)

6. Sixth Embodiment (Computer)

<0. Outline>

The VVC standard, a next-generation codec, has been developed as a successor to AVC/H.264 and HEVC/H.265.

In the VVC standard, in which large CUs (Coding Units) up to 128×128 are employed, the concept of VPDUs (Virtual Pipeline Data Units) has also been introduced in consideration of increases in circuit scale and power consumption in HW decoder implementation, in particular.

The VPDU size is a buffer size that allows smooth processing on each pipeline stage. The VPDU size is often set to the maximum size of TUs (Transform Units).

In VVC, the maximum TU size is 64×64, and the same is assumed to hold true for VPDUs. In VVC, one CU corresponds to one PU, and hence inter prediction processing is required to be performed on PUs larger than VPDUs. Even in this case, the PU can be partitioned into virtual vPUs (virtual PUs) to be processed. VVC is consistent with VPDUs and, because only small buffers are used as illustrated in FIG. 1 to FIG. 4, could be implemented with reasonable HW resources until BIO (Bi-directional optical flow), described later, was employed.

<Exemplary Pipeline without Introduction of VPDUs>

FIG. 1 is a diagram illustrating an example in which a pipeline is structured without the introduction of VPDUs.

In the upper part of FIG. 1, the blocks of a CU, an inter PU, and a TU are illustrated.

The maximum CU size is 128×128. The maximum inter PU size is 128×128. In VVC, one CU corresponds to one PU. The TU includes a TU0 to a TU3, and the maximum size of each TU is 64×64. The TU size is the VPDU size.

As illustrated in the upper part of FIG. 1, the CU is obtained by adding the inter PU generated by inter PU processing and the TU obtained by TU processing together.

In the lower part of FIG. 1, the pipeline including inter PU processing, TU processing, and local decoding processing is illustrated.

In the pipeline, the inter PU processing and the processing on the TU0 to the TU3 are performed in parallel, and the local decoding processing on the CU starts when both are complete. Thus, the inter PU processing requires a buffer of 128×128, and the TU processing requires a buffer of 128×128 to match the PU.

FIG. 2 is a flowchart illustrating Bi prediction (bidirectional prediction) that is one type of inter PU processing in the case of FIG. 1.

In Step S1, inter prediction parameters are acquired.

In Step S2, an L0 prediction block is generated.

In Step S3, an L1 prediction block is generated.

In Step S4, a Bi prediction block PU is generated from the L0 prediction block and the L1 prediction block.

Note that, in Steps S2 to S4, the PU size is required as the maximum buffer size.

<Exemplary Pipeline with Introduction of VPDUs>

FIG. 3 is a diagram illustrating an example in which a pipeline is efficiently structured with the introduction of VPDUs.

Note that, in FIG. 3, points common to those in the description of FIG. 1 are appropriately omitted.

In the upper part of FIG. 3, the blocks of a CU, an inter PU, and a TU are illustrated. Unlike in FIG. 1, the CU includes divisions CU(0) to CU(3) since the PU is virtually partitioned into vPUs to be processed. The PU includes virtual vPU(0) to vPU(3).

In the lower part of FIG. 3, the pipeline including inter PU processing, TU processing, and local decoding processing is illustrated.

In the pipeline, the processing on the vPU(0) to the vPU(3) in the inter PU and the processing on the TU0 to the TU3 are performed in parallel. Thus, when the processing on the vPU(0) and the processing on the TU0 are complete, the local decoding processing on the CU(0) starts. When the processing on the vPU(1) and the processing on the TU1 are complete, the local decoding processing on the CU(1) starts. When the processing on the vPU(2) and the processing on the TU2 are complete, the local decoding processing on the CU(2) starts. When the processing on the vPU(3) and the processing on the TU3 are complete, the local decoding processing on the CU(3) starts.

With the pipeline structured in such a way, a buffer of 64×64 is enough in the inter PU processing, and in the TU processing, a buffer having a size of 64×64 is enough to match the vPU.

FIG. 4 is a flowchart illustrating Bi prediction that is one type of inter PU processing in the case of FIG. 3.

In Step S11, inter prediction parameters are acquired.

In Step S12, the number of vPUs included in the PU is acquired.

In Step S13, 0 is set to the vPU number.

In Step S14, it is determined whether or not the vPU number is smaller than the number of vPUs. In a case where it is determined in Step S14 that the vPU number is smaller than the number of vPUs, the processing proceeds to Step S15.

In Step S15, the position and size of the vPU in the PU are acquired from the vPU number.

In Step S16, an L0 prediction block in the vPU region is generated.

In Step S17, an L1 prediction block in the vPU region is generated.

In Step S18, a Bi prediction block vPU is generated from the L0 prediction block and the L1 prediction block.

In Step S19, the vPU number is incremented. After that, the processing returns to Step S14, and the later processing is repeated.

Further, in a case where it is determined in Step S14 that the vPU number is equal to or larger than the number of vPUs, the Bi prediction ends.

Note that, in Steps S16 to S17, the VPDU size smaller than the PU size is enough for the maximum buffer size.
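The loop of Steps S11 to S19 can be sketched as follows. This is only an illustrative sketch, not the implementation of the present technology: the helper names (split_into_vpus, bi_predict_by_vpu, gen_l0, gen_l1) are hypothetical, the raster-order numbering of vPUs is an assumption (the actual correspondence is the one given in FIG. 19), and plain integer averaging stands in for the Bi prediction of Step S18.

```python
def split_into_vpus(pu_w, pu_h, vpdu=64):
    """Return (x, y, w, h) for each vPU in a PU.

    Raster order is an assumption made for this sketch.
    """
    return [
        (x, y, min(vpdu, pu_w - x), min(vpdu, pu_h - y))
        for y in range(0, pu_h, vpdu)
        for x in range(0, pu_w, vpdu)
    ]


def bi_predict_by_vpu(pu_w, pu_h, gen_l0, gen_l1, vpdu=64):
    """Generate a Bi prediction vPU by vPU.

    gen_l0 / gen_l1 are hypothetical callbacks that return the L0 / L1
    prediction block (a list of rows) for the region (x, y, w, h).
    """
    vpus = split_into_vpus(pu_w, pu_h, vpdu)      # Step S12
    result = {}
    vpu_number = 0                                # Step S13
    while vpu_number < len(vpus):                 # Step S14
        x, y, w, h = vpus[vpu_number]             # Step S15
        l0 = gen_l0(x, y, w, h)                   # Step S16
        l1 = gen_l1(x, y, w, h)                   # Step S17
        # Step S18: integer averaging is an assumption for illustration.
        result[(x, y)] = [
            [(a + b) // 2 for a, b in zip(row0, row1)]
            for row0, row1 in zip(l0, l1)
        ]
        vpu_number += 1                           # Step S19
    return result
```

Because each callback is only ever asked for one vPU region, the largest block held at any time in this sketch is 64×64 even for a 128×128 PU.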

The optical flow method is an image processing method for detecting the motion of an object in a moving image, to thereby estimate a direction in which the object is to move in a certain period of time. Codec inter prediction employing the optical flow method as an option enhances the encoding efficiency. The term “BIO” is based on the fact that the optical flow method is used in Bi prediction in which temporally continuous frames are referred to in units of frames.

<Exemplary Normal Bi Prediction>

FIG. 5 is a diagram illustrating exemplary normal Bi prediction.

In FIG. 5, the arrow extending from the left to the right represents time in the display order. Further, FIG. 5 illustrates an example in which optimal MVs on a reference plane 0 in an L0 direction and a reference plane 1 in an L1 direction are obtained for the Bi prediction value of a Bi prediction block on a picture B. The same holds true for the following figures.

The Bi prediction value corresponds to a pixel L0 of an L0 prediction block on the reference plane 0 and a pixel L1 of an L1 prediction block on the reference plane 1, and the Bi prediction value is thus obtained from (L0+L1)/2.

As illustrated in FIG. 5, in the normal Bi prediction, optimal MVs (MV_L0 and MV_L1) are different from predicted MVs (MVP_L0 and MVP_L1), and hence the encoding of difference MVs (MVD_L0 and MVD_L1) is necessary.

<Exemplary Bi Prediction Employing BIO>

FIG. 6 is a diagram illustrating exemplary Bi prediction employing BIO.

FIG. 6 illustrates, as the Bi prediction employing BIO, an example in which a gradient (G) and a velocity (V) are obtained by the optical flow method for prediction blocks generated with the predicted MVs (MVP_L0 and MVP_L1). The gradient (G) and the velocity (V) are obtained by the optical flow method for the prediction blocks so that a result equivalent to that in the normal Bi prediction is obtained.

In the case of the Bi prediction employing BIO, the predicted MVs (MVP_L0 and MVP_L1) are directly used as the MVs (MV_L0 and MV_L1), and hence the encoding of the difference MVs (MVD_L0 and MVD_L1) is unnecessary, which means that the encoding efficiency is enhanced.

The Bi prediction value almost corresponds to a pixel L0′ of the L0 prediction block on the reference plane 0 and a pixel L1′ of the L1 prediction block on the reference plane 1, and the Bi prediction value is thus obtained from (L0′+L1′+B)/2. That is, the gradients (G: Gx and Gy) and the velocities (V: Vx and Vy) are required to be calculated from the L0 prediction block and the L1 prediction block, thereby obtaining a correction value B=Vx*Gx+Vy*Gy.
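The two per-pixel formulas, (L0+L1)/2 for the normal Bi prediction and (L0′+L1′+B)/2 with B=Vx*Gx+Vy*Gy for the Bi prediction employing BIO, can be written out as the following sketch. The function names are hypothetical, the gradients and velocities are taken as given inputs (how they are derived by the optical flow method is not shown), and floating-point arithmetic is an assumption made for readability.

```python
def normal_bi_prediction_value(l0, l1):
    """Normal Bi prediction value for one pixel: (L0 + L1) / 2."""
    return (l0 + l1) / 2


def bio_correction(vx, vy, gx, gy):
    """Correction value B = Vx*Gx + Vy*Gy for one pixel."""
    return vx * gx + vy * gy


def bio_bi_prediction_value(l0, l1, vx, vy, gx, gy):
    """Bi prediction value employing BIO: (L0' + L1' + B) / 2."""
    return (l0 + l1 + bio_correction(vx, vy, gx, gy)) / 2
```

When the velocities are zero, B is zero and the BIO value coincides with the normal Bi prediction value, which is the situation the SAD-based reduction method aims to detect cheaply.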

<Exemplary 2-Block Partition in Bi Prediction>

FIG. 7 is a diagram illustrating exemplary 2-block partition in the normal Bi prediction.

In the normal Bi prediction, there are two blocks so that, as illustrated in FIG. 7, block partition information regarding the two blocks and two difference MVs (MVDs) are obtained. Thus, the encoding of the block partition information regarding the two blocks and the two difference MVs (MVDs) is necessary.

<Exemplary 2-Block Partition in Bi Prediction Employing BIO>

FIG. 8 is a diagram illustrating exemplary 2-block partition in the Bi prediction employing BIO.

In the Bi prediction employing BIO, even when there are two blocks, as illustrated in FIG. 8, the gradient (G) and the velocity (V) are obtained by the optical flow method without partitioning the blocks so that a result equivalent to that in the normal Bi prediction is obtained.

As described above, in the Bi prediction employing BIO of FIG. 8, the encoding of block partition information and the encoding of difference MVs (MVDs), both of which are necessary in the Bi prediction of FIG. 7, can be eliminated, with the result that the encoding efficiency can be enhanced.

Meanwhile, the calculation costs of the gradient (G) and the velocity (V), which are obtained in BIO, are very high. Thus, a reduction is particularly required in terms of cost-effectiveness in a case where, as a result of the calculation of the gradient (G) and the velocity (V), there is almost no difference from prediction values obtained by normal Bi prediction due to small absolute values, for example.

Various reduction methods in terms of BIO have been proposed. In one of the reduction methods, the SAD (Sum of Absolute Difference) of an L0 prediction block and an L1 prediction block is calculated when the blocks are generated, and, in a case where the SAD value falls below a certain threshold, BIO is not applied and normal Bi prediction is executed.

This is based on a tendency that the velocity (V) is small and BIO is thus not very effective when the SAD value is small, and achieves early termination, that is, eliminates the high-cost calculation in a case where the effect is not expected.
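The SAD-based early termination described above can be sketched as follows. The function names and the threshold value are hypothetical; only the rule itself (BIO is skipped when the SAD of the L0 and L1 prediction blocks falls below a threshold) comes from the description.

```python
def sad(block0, block1):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(
        abs(a - b)
        for row0, row1 in zip(block0, block1)
        for a, b in zip(row0, row1)
    )


def should_apply_bio(l0_block, l1_block, threshold):
    """Return False (normal Bi prediction) when the SAD falls below the threshold."""
    return sad(l0_block, l1_block) >= threshold
```

For example, with l0 = [[10, 12], [14, 16]] and l1 = [[11, 12], [13, 20]] the SAD is 6, so BIO is skipped for a threshold of 8 and applied for a threshold of 6.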

In a case where the reduction method in terms of BIO described above is applied, the SAD of L0 and L1 prediction blocks is calculated for an entire PU and compared to the threshold, thereby determining whether or not to apply BIO processing, and the processing then branches. Thus, in a case where inter prediction is performed on a PU larger than a VPDU, it is difficult to virtually partition the PU into a plurality of vPUs.

In this case, as a buffer necessary for gradient calculation or velocity calculation, a region slightly larger than the PU is required, with the result that a BIO-included inter prediction processing unit requires a large buffer resource.

Further, in a case where the reduction in terms of BIO is implemented by HW, due to a large difference between the pipeline delay of BIO-included inter prediction and the pipeline delay of TU processing, HW implementation that maintains throughput is difficult to achieve.

Accordingly, in the present technology, a unit of processing in calculation of a cost that is used for determining whether or not to perform bidirectional prediction such as BIO (for example, PU) is partitioned into partitioned processing units each of which corresponds to the VPDU size (for example, vPU) or is equal to or smaller than the VPDU size (for example, sPU described later), and the determination is made by using the cost calculated on the basis of the partitioned processing units. Note that the size corresponding to the VPDU size means a size slightly larger than the VPDU size.
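One possible reading of this idea is sketched below: the cost (here, a SAD) is calculated separately for each partitioned processing unit, and a determination is made per unit, so no cost calculation needs to cover the whole PU at once. The function names, the per-unit threshold, and the use of one independent flag per unit are assumptions for illustration; the embodiments described later define the actual procedures.

```python
def region_sad(l0, l1, x, y, w, h):
    """SAD of the L0 and L1 prediction blocks restricted to one region."""
    return sum(
        abs(l0[j][i] - l1[j][i])
        for j in range(y, y + h)
        for i in range(x, x + w)
    )


def bio_decisions_per_unit(l0, l1, unit_w, unit_h, threshold):
    """Return {(x, y): apply_bio} for each partitioned processing unit.

    l0 / l1 are PU-sized blocks (lists of rows); unit_w / unit_h give the
    partitioned processing unit size; threshold is a hypothetical per-unit value.
    """
    height, width = len(l0), len(l0[0])
    decisions = {}
    for y in range(0, height, unit_h):
        for x in range(0, width, unit_w):
            w = min(unit_w, width - x)
            h = min(unit_h, height - y)
            decisions[(x, y)] = region_sad(l0, l1, x, y, w, h) >= threshold
    return decisions
```

In this sketch a 128×128 PU with 64×64 units yields four determinations, each of which depends only on samples inside its own unit.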

Note that, herein, with regard to block size, “A is larger than B” means “the horizontal size of A is larger than the horizontal size of B” or “the vertical size of A is larger than the vertical size of B.”

Further, with regard to block size, “A is equal to or smaller than B” means “the horizontal size of A is equal to or smaller than the horizontal size of B and the vertical size of A is equal to or smaller than the vertical size of B.”
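These two definitions can be expressed directly as predicates. The sketch below uses hypothetical names (Size, is_larger, is_equal_or_smaller); only the two comparison rules are taken from the text.

```python
from typing import NamedTuple


class Size(NamedTuple):
    width: int   # horizontal size
    height: int  # vertical size


def is_larger(a: Size, b: Size) -> bool:
    """A is larger than B if it exceeds B horizontally OR vertically."""
    return a.width > b.width or a.height > b.height


def is_equal_or_smaller(a: Size, b: Size) -> bool:
    """A is equal to or smaller than B if it fits within B horizontally AND vertically."""
    return a.width <= b.width and a.height <= b.height
```

Under these definitions the two relations are logical complements: a 128×32 block is "larger than" a 64×64 VPDU even though it contains fewer samples, because its horizontal size exceeds 64.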

Now, the present technology is described in detail.

<1. First Embodiment (Exemplary Partition with vPUs)>

<Configuration Example of Encoding Device>

FIG. 9 is a block diagram illustrating a configuration example of an encoding device according to an embodiment of the present technology.

An encoding device 1 of FIG. 9 includes an A/D conversion unit 31, a screen rearrangement buffer 32, a calculation unit 33, an orthogonal transform unit 34, a quantization unit 35, a lossless encoding unit 36, an accumulation buffer 37, an inverse quantization unit 38, an inverse orthogonal transform unit 39, and an addition unit 40. Further, the encoding device 1 includes a deblocking filter 41, an adaptive offset filter 42, an adaptive loop filter 43, a frame memory 44, a switch 45, an intra prediction unit 46, a motion prediction/compensation unit 47, a predicted image selection unit 48, and a rate control unit 49.

The A/D conversion unit 31 performs A/D conversion on images in units of frames input to be encoded. The A/D conversion unit 31 outputs the images, which are digital signals after the conversion, to the screen rearrangement buffer 32, where they are stored.

The screen rearrangement buffer 32 rearranges images in units of frames stored in a display order into an encoding order on the basis of the GOP structure. The screen rearrangement buffer 32 outputs the rearranged images to the calculation unit 33, the intra prediction unit 46, and the motion prediction/compensation unit 47.

The calculation unit 33 subtracts predicted images supplied from the predicted image selection unit 48 from images supplied from the screen rearrangement buffer 32, to thereby perform encoding. The calculation unit 33 outputs the images obtained as a result of the subtraction as residual information (difference) to the orthogonal transform unit 34. Note that, in a case where no predicted image is supplied from the predicted image selection unit 48, the calculation unit 33 directly outputs images read out from the screen rearrangement buffer 32 as residual information to the orthogonal transform unit 34.

The orthogonal transform unit 34 performs orthogonal transform processing on residual information from the calculation unit 33. The orthogonal transform unit 34 outputs the images obtained as a result of the orthogonal transform processing to the quantization unit 35.

The quantization unit 35 quantizes images obtained as a result of orthogonal transform processing supplied from the orthogonal transform unit 34. The quantization unit 35 outputs the quantized values obtained as a result of the quantization to the lossless encoding unit 36.

The lossless encoding unit 36 acquires intra prediction mode information that is information indicating an optimal intra prediction mode from the intra prediction unit 46. Further, the lossless encoding unit 36 acquires inter prediction mode information that is information indicating an optimal inter prediction mode and inter prediction parameters such as motion information and reference image information from the motion prediction/compensation unit 47.

Further, the lossless encoding unit 36 acquires offset filter information associated with an offset filter from the adaptive offset filter 42 and acquires filter coefficients from the adaptive loop filter 43.

The lossless encoding unit 36 performs, on quantized values supplied from the quantization unit 35, lossless encoding such as variable-length coding (for example, CAVLC (Context-Adaptive Variable Length Coding)) or arithmetic coding (for example, CABAC (Context-Adaptive Binary Arithmetic Coding)).

Further, the lossless encoding unit 36 losslessly encodes, as encoding information associated with encoding, the intra prediction mode information or the inter prediction mode information, the inter prediction parameters, the offset filter information, or the filter coefficients. The lossless encoding unit 36 outputs the lossless-encoded encoding information and quantized values as encoded data to the accumulation buffer 37 and accumulates the information and the quantized values therein.

The accumulation buffer 37 temporarily stores encoded data supplied from the lossless encoding unit 36. Further, the accumulation buffer 37 outputs the stored encoded data as encoded streams to the subsequent stage.

Further, the quantized values output from the quantization unit 35 are also input to the inverse quantization unit 38. The inverse quantization unit 38 inversely quantizes the quantized values, and outputs the orthogonal transform processing results obtained as a result of the inverse quantization to the inverse orthogonal transform unit 39.

The inverse orthogonal transform unit 39 performs inverse orthogonal transform processing on orthogonal transform processing results supplied from the inverse quantization unit 38. Examples of the inverse orthogonal transform include IDCT (inverse discrete cosine transform) and IDST (inverse discrete sine transform). The inverse orthogonal transform unit 39 outputs the residual information obtained as a result of the inverse orthogonal transform processing to the addition unit 40.

The addition unit 40 adds residual information supplied from the inverse orthogonal transform unit 39 and predicted images supplied from the predicted image selection unit 48 together, to thereby perform decoding. The addition unit 40 outputs the decoded images to the deblocking filter 41 and the frame memory 44.
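The relationship between the subtraction in the calculation unit 33 and the addition in the addition unit 40 can be illustrated with the following sketch. The function names are hypothetical, and the orthogonal transform, quantization, and their inverses are deliberately bypassed here, which is an assumption that makes the round trip exact; in the actual device quantization makes the local decoding lossy.

```python
def compute_residual(image, predicted=None):
    """Sketch of the calculation unit 33: image minus predicted image.

    When no predicted image is supplied, the image itself is output
    as the residual information.
    """
    if predicted is None:
        return [row[:] for row in image]
    return [
        [a - p for a, p in zip(img_row, pred_row)]
        for img_row, pred_row in zip(image, predicted)
    ]


def reconstruct(residual, predicted):
    """Sketch of the addition unit 40: residual plus predicted image."""
    return [
        [r + p for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, predicted)
    ]
```

With image = [[52, 55], [61, 59]] and predicted = [[50, 50], [60, 60]], the residual is [[2, 5], [1, -1]], and adding the predicted image back recovers the original image.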

The deblocking filter 41 performs deblocking filter processing ofeliminating block deformation on decoded images supplied from theaddition unit 40. The deblocking filter 41 outputs the images obtainedas a result of the deblocking filter processing to the adaptive offsetfilter 42.

The adaptive offset filter 42 performs adaptive offset filter (SAO(Sample adaptive offset)) processing of mainly eliminating ringing onimages obtained as a result of deblocking filter processing by thedeblocking filter 41.

The adaptive offset filter 42 outputs the images obtained as a result ofthe adaptive offset filter processing to the adaptive loop filter 43.Further, the adaptive offset filter 42 supplies, as offset filterinformation, information indicating the types of the adaptive offsetfilter processing and the offsets to the lossless encoding unit 36.

The adaptive loop filter 43 includes a two-dimensional Wiener filter,for example. The adaptive loop filter 43 performs adaptive loop filter(ALF) processing on images obtained as a result of adaptive offsetfilter processing.

The adaptive loop filter 43 outputs the images obtained as a result ofthe adaptive loop filter processing to the frame memory 44. Further, theadaptive loop filter 43 outputs the filter coefficients used in theadaptive loop filter processing to the lossless encoding unit 36.

The frame memory 44 accumulates images supplied from the adaptive loopfilter 43 and images supplied from the addition unit 40. Of the imagesaccumulated in the frame memory 44 without being subjected to the filterprocessing, images neighboring the CUs are output as peripheral imagesto the intra prediction unit 46 through the switch 45. Meanwhile, theimages subjected to the filter processing to be accumulated in the framememory 44 are output as reference images to the motionprediction/compensation unit 47 through the switch 45.

The intra prediction unit 46 performs intra prediction processing in all candidate intra prediction modes in units of PUs by using peripheral images read out from the frame memory 44 through the switch 45.

Further, the intra prediction unit 46 calculates RD costs in all the candidate intra prediction modes on the basis of images read out from the screen rearrangement buffer 32 and predicted images generated by the intra prediction processing. The intra prediction unit 46 determines the intra prediction mode having the minimum calculated RD cost as an optimal intra prediction mode.

The intra prediction unit 46 outputs the predicted image generated in the optimal intra prediction mode to the predicted image selection unit 48. When notified that the predicted image generated in the optimal intra prediction mode has been selected, the intra prediction unit 46 outputs the intra prediction mode information to the lossless encoding unit 36. Note that the intra prediction mode is a mode indicating PU sizes, prediction directions, and the like.

The motion prediction/compensation unit 47 performs motion prediction/compensation processing in all candidate inter prediction modes. The motion prediction/compensation unit 47 includes an inter prediction unit 51 configured to compensate for predicted motions to generate predicted images.

The motion prediction/compensation unit 47 detects motion information (motion vectors) in all the candidate inter prediction modes on the basis of images supplied from the screen rearrangement buffer 32 and reference images read out from the frame memory 44 through the switch 45.

The motion prediction/compensation unit 47 supplies, to the inter prediction unit 51, PU positions in frames, PU sizes, prediction directions, reference image information, motion information, and the like that correspond to the detected motion information as inter prediction parameters.

The inter prediction unit 51 generates predicted images by BIO processing-included Bi prediction, for example, by using inter prediction parameters supplied from the motion prediction/compensation unit 47.

The motion prediction/compensation unit 47 calculates RD costs in all the candidate inter prediction modes on the basis of images supplied from the screen rearrangement buffer 32 and predicted images generated by the inter prediction unit 51. The motion prediction/compensation unit 47 determines an inter prediction mode having the minimum RD cost as an optimal inter prediction mode.

The RD cost and the predicted image in the determined optimal inter prediction mode are output to the predicted image selection unit 48. The inter prediction parameters in the determined optimal inter prediction mode are output to the lossless encoding unit 36.

The predicted image selection unit 48 determines, as an optimal prediction mode, whichever of the optimal intra prediction mode supplied from the intra prediction unit 46 and the optimal inter prediction mode supplied from the motion prediction/compensation unit 47 has the smaller RD cost. Then, the predicted image selection unit 48 outputs the predicted image in the optimal prediction mode to the calculation unit 33 and the addition unit 40.
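As an illustration only, the selection performed by the predicted image selection unit 48 can be sketched as follows. The class name ModeCandidate, its fields, and the function name are hypothetical and do not appear in the present disclosure. The behavior for equal RD costs is an assumption, since the description does not define it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModeCandidate:
    """Hypothetical container for one optimal-mode candidate."""
    kind: str                   # "intra" or "inter" (illustrative labels)
    rd_cost: float              # RD cost reported for the candidate
    predicted: List[List[int]]  # predicted image as rows of samples


def select_optimal_mode(intra: ModeCandidate, inter: ModeCandidate) -> ModeCandidate:
    """Return the candidate with the smaller RD cost."""
    # Assumption: on equal RD costs the intra candidate is kept;
    # the description does not define tie-breaking.
    return inter if inter.rd_cost < intra.rd_cost else intra
```

The predicted image of the returned candidate corresponds to the image that is output to the calculation unit 33 and the addition unit 40.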

The rate control unit 49 controls the rate of the quantization operation by the quantization unit 35 on the basis of encoded data accumulated in the accumulation buffer 37 so that neither overflow nor underflow occurs.

<Operation of Encoding Device>

FIG. 10 and FIG. 11 are flowcharts illustrating the details of encoding processing by the encoding device.

In Step S31 of FIG. 10, the A/D conversion unit 31 performs A/D conversion on images in units of frames input to be encoded. The A/D conversion unit 31 outputs the resulting digital image signals to the screen rearrangement buffer 32, which stores them.

In Step S32, the screen rearrangement buffer 32 rearranges the frame images stored in display order into encoding order on the basis of the GOP structure. The screen rearrangement buffer 32 outputs the rearranged images in units of frames to the calculation unit 33, the intra prediction unit 46, and the motion prediction/compensation unit 47.

In Step S33, the intra prediction unit 46 performs intra prediction processing in all candidate intra prediction modes. Further, the intra prediction unit 46 calculates RD costs in all the candidate intra prediction modes on the basis of the image read out from the screen rearrangement buffer 32 and predicted images generated by the intra prediction processing. The intra prediction unit 46 determines an intra prediction mode having the minimum RD cost as an optimal intra prediction mode. The intra prediction unit 46 outputs the predicted image generated in the optimal intra prediction mode to the predicted image selection unit 48.

In Step S34, the motion prediction/compensation unit 47 performs motion prediction/compensation processing in all candidate inter prediction modes.

The motion prediction/compensation unit 47 detects motion information (motion vectors) in all the candidate inter prediction modes on the basis of the image supplied from the screen rearrangement buffer 32 and reference images read out from the frame memory 44 through the switch 45.

The inter prediction unit 51 generates predicted images by BIO processing-included Bi prediction, for example, by using inter prediction parameters supplied from the motion prediction/compensation unit 47.

The motion prediction/compensation unit 47 calculates RD costs in all the candidate inter prediction modes on the basis of the image supplied from the screen rearrangement buffer 32 and the predicted images generated by the inter prediction unit 51. The motion prediction/compensation unit 47 determines an inter prediction mode having the minimum RD cost as an optimal inter prediction mode.

The RD cost and the predicted image in the determined optimal inter prediction mode are output to the predicted image selection unit 48. The inter prediction parameters in the determined optimal inter prediction mode are output to the lossless encoding unit 36.

In Step S35, the predicted image selection unit 48 determines, as an optimal prediction mode, whichever of the optimal intra prediction mode and the optimal inter prediction mode has the smaller RD cost. Then, the predicted image selection unit 48 outputs the predicted image in the optimal prediction mode to the calculation unit 33 and the addition unit 40.

In Step S36, the predicted image selection unit 48 determines whether the optimal prediction mode is the optimal inter prediction mode. In a case where it is determined in Step S36 that the optimal prediction mode is the optimal inter prediction mode, the predicted image selection unit 48 notifies the motion prediction/compensation unit 47 that the predicted image generated in the optimal inter prediction mode has been selected.

Then, in Step S37, the motion prediction/compensation unit 47 outputs the inter prediction mode information and the inter prediction parameters to the lossless encoding unit 36. After that, the processing proceeds to Step S39.

Meanwhile, in a case where it is determined in Step S36 that the optimal prediction mode is the optimal intra prediction mode, the predicted image selection unit 48 notifies the intra prediction unit 46 that the predicted image generated in the optimal intra prediction mode has been selected. Then, in Step S38, the intra prediction unit 46 outputs the intra prediction mode information to the lossless encoding unit 36. After that, the processing proceeds to Step S39.

In Step S39, the calculation unit 33 subtracts the predicted image supplied from the predicted image selection unit 48 from the image supplied from the screen rearrangement buffer 32, to thereby perform encoding. The calculation unit 33 outputs the image obtained as a result of the subtraction as residual information to the orthogonal transform unit 34.

In Step S40, the orthogonal transform unit 34 performs orthogonal transform processing on the residual information. The orthogonal transform unit 34 outputs the orthogonal transform processing result obtained as a result of the orthogonal transform processing to the quantization unit 35.

In Step S41, the quantization unit 35 quantizes the orthogonal transform processing result supplied from the orthogonal transform unit 34. The quantization unit 35 outputs the quantized value obtained as a result of the quantization to the lossless encoding unit 36 and the inverse quantization unit 38.

In Step S42 of FIG. 11, the inverse quantization unit 38 inversely quantizes the quantized value from the quantization unit 35. The inverse quantization unit 38 outputs the orthogonal transform processing result obtained as a result of the inverse quantization to the inverse orthogonal transform unit 39.

In Step S43, the inverse orthogonal transform unit 39 performs inverse orthogonal transform processing on the orthogonal transform processing result. The inverse orthogonal transform unit 39 outputs the residual information obtained as a result of the inverse orthogonal transform processing to the addition unit 40.

In Step S44, the addition unit 40 adds the residual information supplied from the inverse orthogonal transform unit 39 and the predicted image supplied from the predicted image selection unit 48 together, to thereby perform decoding. The addition unit 40 outputs the decoded image to the deblocking filter 41 and the frame memory 44.
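The data flow of Steps S39 to S44 can be sketched as follows, as a simplified illustration only. The orthogonal transform and the inverse orthogonal transform are omitted, and a uniform scalar quantizer with an integer step `qstep` is assumed. Neither simplification is taken from the present disclosure, and all function names are hypothetical.

```python
def quantize(value, qstep):
    """Uniform scalar quantization, rounding half away from zero (assumed)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qstep // 2) // qstep)


def local_decode(orig, pred, qstep):
    """Sketch of Steps S39 to S44 with the transform stages omitted.

    orig, pred: blocks given as lists of rows of integer samples.
    Returns (levels, recon): quantized values and the locally decoded block.
    """
    # Step S39: residual information = input image - predicted image.
    residual = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(orig, pred)]
    # Step S41: quantization (Step S40, the transform, is skipped here).
    levels = [[quantize(r, qstep) for r in row] for row in residual]
    # Step S42: inverse quantization (Step S43, the inverse transform, is skipped).
    deq = [[lv * qstep for lv in row] for row in levels]
    # Step S44: decoded image = residual information + predicted image.
    recon = [[p + d for p, d in zip(rp, rd)] for rp, rd in zip(pred, deq)]
    return levels, recon
```

With `qstep` equal to 1 the locally decoded block equals the input block. Larger steps introduce the quantization error that the subsequent filter stages operate on.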

In Step S45, the deblocking filter 41 performs deblocking filter processing on the image supplied from the addition unit 40. The deblocking filter 41 outputs the image obtained as a result of the deblocking filter processing to the adaptive offset filter 42.

In Step S46, the adaptive offset filter 42 performs adaptive offset filter processing on the image obtained as a result of the deblocking filter processing. The adaptive offset filter 42 outputs the image obtained as a result of the adaptive offset filter processing to the adaptive loop filter 43. Further, the adaptive offset filter 42 outputs the offset filter information to the lossless encoding unit 36.

In Step S47, the adaptive loop filter 43 performs adaptive loop filter processing on the image obtained as a result of the adaptive offset filter processing. The adaptive loop filter 43 outputs the image obtained as a result of the adaptive loop filter processing to the frame memory 44. Further, the adaptive loop filter 43 outputs the filter coefficients used in the adaptive loop filter processing to the lossless encoding unit 36.

In Step S48, the frame memory 44 accumulates the image supplied from the adaptive loop filter 43 and the image supplied from the addition unit 40. Of the images accumulated in the frame memory 44 without being subjected to the filter processing, images neighboring the CUs are output as peripheral images to the intra prediction unit 46 through the switch 45. Meanwhile, the images that have been subjected to the filter processing and accumulated in the frame memory 44 are output as reference images to the motion prediction/compensation unit 47 through the switch 45.

In Step S49, the lossless encoding unit 36 losslessly encodes, as encoding information, the intra prediction mode information or the inter prediction mode information, the inter prediction parameters, the offset filter information, or the filter coefficients.

In Step S50, the lossless encoding unit 36 losslessly encodes the quantized value supplied from the quantization unit 35. Then, the lossless encoding unit 36 generates encoded data from the encoding information losslessly encoded by the processing in Step S49 and the lossless-encoded quantized value and outputs the encoded data to the accumulation buffer 37.

In Step S51, the accumulation buffer 37 temporarily accumulates the encoded data supplied from the lossless encoding unit 36.

In Step S52, the rate control unit 49 controls the rate of the quantization operation by the quantization unit 35 on the basis of the encoded data accumulated in the accumulation buffer 37 so that neither overflow nor underflow occurs. After that, the encoding processing ends.

Note that, in the encoding processing of FIG. 10 and FIG. 11, for the sake of simple description, the intra prediction processing and the motion prediction/compensation processing are always performed, but in reality, only one of the intra prediction processing and the motion prediction/compensation processing may be performed depending on picture types or the like.

<Configuration Example of Decoding Device>

FIG. 12 is a block diagram illustrating a configuration example of an embodiment of a decoding device to which the present disclosure is applied, which decodes encoded streams transmitted from the encoding device of FIG. 9.

A decoding device 101 of FIG. 12 includes an accumulation buffer 131, a lossless decoding unit 132, an inverse quantization unit 133, an inverse orthogonal transform unit 134, an addition unit 135, a deblocking filter 136, an adaptive offset filter 137, an adaptive loop filter 138, and a screen rearrangement buffer 139. Further, the decoding device 101 includes a D/A conversion unit 140, a frame memory 141, a switch 142, an intra prediction unit 143, the inter prediction unit 51, and a switch 144.

The accumulation buffer 131 of the decoding device 101 receives encoded data transmitted as encoded streams from the encoding device 1 of FIG. 9 and accumulates the encoded data. The accumulation buffer 131 outputs the accumulated encoded data to the lossless decoding unit 132.

The lossless decoding unit 132 performs lossless decoding such as variable length decoding or arithmetic decoding on encoded data from the accumulation buffer 131, to thereby obtain quantized values and encoding information. The lossless decoding unit 132 outputs the quantized values to the inverse quantization unit 133. The encoding information includes intra prediction mode information, inter prediction mode information, inter prediction parameters, offset filter information, filter coefficients, or the like.

Further, the lossless decoding unit 132 outputs the intra prediction mode information and the like to the intra prediction unit 143. The lossless decoding unit 132 outputs the inter prediction parameters, the inter prediction mode information, and the like to the inter prediction unit 51.

The lossless decoding unit 132 outputs the intra prediction mode information or the inter prediction mode information to the switch 144. The lossless decoding unit 132 outputs the offset filter information to the adaptive offset filter 137. The lossless decoding unit 132 outputs the filter coefficients to the adaptive loop filter 138.

The inverse quantization unit 133, the inverse orthogonal transform unit 134, the addition unit 135, the deblocking filter 136, the adaptive offset filter 137, the adaptive loop filter 138, the frame memory 141, the switch 142, the intra prediction unit 143, and the inter prediction unit 51 perform processing similar to that of the inverse quantization unit 38, the inverse orthogonal transform unit 39, the addition unit 40, the deblocking filter 41, the adaptive offset filter 42, the adaptive loop filter 43, the frame memory 44, the switch 45, the intra prediction unit 46, and the motion prediction/compensation unit 47 of FIG. 9, respectively. With this, images are decoded.

Specifically, the inverse quantization unit 133 is configured like the inverse quantization unit 38 of FIG. 9. The inverse quantization unit 133 inversely quantizes quantized values from the lossless decoding unit 132. The inverse quantization unit 133 outputs the orthogonal transform processing results obtained as a result of the inverse quantization to the inverse orthogonal transform unit 134.

The inverse orthogonal transform unit 134 is configured like the inverse orthogonal transform unit 39 of FIG. 9. The inverse orthogonal transform unit 134 performs inverse orthogonal transform processing on orthogonal transform processing results supplied from the inverse quantization unit 133. The inverse orthogonal transform unit 134 outputs the residual information obtained as a result of the inverse orthogonal transform processing to the addition unit 135.

The addition unit 135 adds residual information supplied from the inverse orthogonal transform unit 134 and predicted images supplied from the switch 144 together, to thereby perform decoding. The addition unit 135 outputs the decoded images to the deblocking filter 136 and the frame memory 141.

The deblocking filter 136 performs deblocking filter processing on images supplied from the addition unit 135 and outputs the images obtained as a result of the deblocking filter processing to the adaptive offset filter 137.

The adaptive offset filter 137 performs, by using offsets indicated by offset filter information from the lossless decoding unit 132, adaptive offset filter processing of types indicated by the offset filter information on images obtained as a result of deblocking filter processing. The adaptive offset filter 137 outputs the images obtained as a result of the adaptive offset filter processing to the adaptive loop filter 138.

The adaptive loop filter 138 performs adaptive loop filter processing on images supplied from the adaptive offset filter 137 by using filter coefficients supplied from the lossless decoding unit 132. The adaptive loop filter 138 outputs the images obtained as a result of the adaptive loop filter processing to the frame memory 141 and the screen rearrangement buffer 139.

The screen rearrangement buffer 139 stores images obtained as a result of adaptive loop filter processing in units of frames. The screen rearrangement buffer 139 rearranges the images in units of frames from the encoding order into the original display order and outputs the rearranged images to the D/A conversion unit 140.
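As an illustration only, the rearrangement from the encoding order into the display order can be sketched as follows. It is assumed here that each decoded frame is tagged with a display index. The tag and the function name are hypothetical, and a real buffer would typically output frames incrementally rather than sort a complete list.

```python
def to_display_order(decoded):
    """Rearrange frames from encoding (decoding) order into display order.

    decoded: list of (display_index, frame) pairs in the order in which
    the frames were decoded. The display index is a hypothetical tag.
    """
    return [frame for _, frame in sorted(decoded, key=lambda item: item[0])]


# Frames arrive in encoding order I0, P3, B1, B2 and leave in display order.
frames = [(0, "I0"), (3, "P3"), (1, "B1"), (2, "B2")]
display = to_display_order(frames)
```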

The D/A conversion unit 140 performs D/A conversion on images in units of frames supplied from the screen rearrangement buffer 139 and outputs the converted images.

The frame memory 141 accumulates images obtained as a result of adaptive loop filter processing and images supplied from the addition unit 135. Of the images accumulated in the frame memory 141 without being subjected to the filter processing, images neighboring the CUs are supplied as peripheral images to the intra prediction unit 143 through the switch 142. Meanwhile, the images that have been subjected to the filter processing and accumulated in the frame memory 141 are output as reference images to the inter prediction unit 51 through the switch 142.

The intra prediction unit 143 performs, by using peripheral images read out from the frame memory 141 through the switch 142, intra prediction processing in an optimal intra prediction mode indicated by intra prediction mode information supplied from the lossless decoding unit 132. The intra prediction unit 143 outputs the thus generated predicted images to the switch 144.

The inter prediction unit 51 is configured like the one in FIG. 9. The inter prediction unit 51 performs, by using inter prediction parameters supplied from the lossless decoding unit 132, inter prediction in an optimal inter prediction mode indicated by inter prediction mode information, to thereby generate a predicted image.

The inter prediction unit 51 reads out, from the frame memory 141 through the switch 142, reference images specified by reference image information that is an inter prediction parameter supplied from the lossless decoding unit 132. The inter prediction unit 51 generates predicted images with BIO processing-included Bi prediction, for example, by using motion information that is an inter prediction parameter supplied from the lossless decoding unit 132 and the read-out reference images. The generated predicted images are output to the switch 144.

The switch 144 outputs, in a case where intra prediction mode information has been supplied from the lossless decoding unit 132, predicted images supplied from the intra prediction unit 143 to the addition unit 135. Meanwhile, the switch 144 outputs, in a case where inter prediction mode information has been supplied from the lossless decoding unit 132, predicted images supplied from the inter prediction unit 51 to the addition unit 135.

<Operation of Decoding Device>

FIG. 13 is a flowchart illustrating the details of decoding processing by the decoding device.

In Step S131 of FIG. 13, the accumulation buffer 131 of the decoding device 101 receives encoded data in units of frames supplied from the preceding stage, which is not illustrated, and accumulates the encoded data. The accumulation buffer 131 outputs the accumulated encoded data to the lossless decoding unit 132.

In Step S132, the lossless decoding unit 132 losslessly decodes the encoded data from the accumulation buffer 131 to obtain a quantized value and encoding information. The lossless decoding unit 132 outputs the quantized value to the inverse quantization unit 133.

The lossless decoding unit 132 outputs intra prediction mode information and the like to the intra prediction unit 143. The lossless decoding unit 132 outputs inter prediction parameters, inter prediction mode information, and the like to the inter prediction unit 51.

Further, the lossless decoding unit 132 outputs the intra prediction mode information or the inter prediction mode information to the switch 144. The lossless decoding unit 132 supplies offset filter information to the adaptive offset filter 137 and outputs filter coefficients to the adaptive loop filter 138.

In Step S133, the inverse quantization unit 133 inversely quantizes the quantized value supplied from the lossless decoding unit 132. The inverse quantization unit 133 outputs the orthogonal transform processing result obtained as a result of the inverse quantization to the inverse orthogonal transform unit 134.

In Step S134, the inverse orthogonal transform unit 134 performs inverse orthogonal transform processing on the orthogonal transform processing result supplied from the inverse quantization unit 133.

In Step S135, the inter prediction unit 51 determines whether the inter prediction mode information has been supplied from the lossless decoding unit 132. In a case where it is determined in Step S135 that the inter prediction mode information has been supplied, the processing proceeds to Step S136.

In Step S136, the inter prediction unit 51 reads out reference images on the basis of reference image specification information supplied from the lossless decoding unit 132, and performs, by using motion information and the reference images, motion compensation processing in an optimal inter prediction mode indicated by the inter prediction mode information. For example, the inter prediction unit 51 generates a predicted image with BIO processing-included Bi prediction. The inter prediction unit 51 outputs the generated predicted image to the addition unit 135 through the switch 144. After that, the processing proceeds to Step S138.

Meanwhile, in a case where it is determined in Step S135 that the inter prediction mode information has not been supplied, that is, in a case where the intra prediction mode information has been supplied to the intra prediction unit 143, the processing proceeds to Step S137.

In Step S137, the intra prediction unit 143 performs, by using peripheral images read out from the frame memory 141 through the switch 142, intra prediction processing in an intra prediction mode indicated by the intra prediction mode information. The intra prediction unit 143 outputs the predicted image generated as a result of the intra prediction processing to the addition unit 135 through the switch 144. After that, the processing proceeds to Step S138.

In Step S138, the addition unit 135 adds residual information supplied from the inverse orthogonal transform unit 134 and the predicted image supplied from the switch 144 together, to thereby perform decoding. The addition unit 135 outputs the decoded image to the deblocking filter 136 and the frame memory 141.

In Step S139, the deblocking filter 136 performs deblocking filter processing on the image supplied from the addition unit 135 to remove block distortion. The deblocking filter 136 outputs the image obtained as a result of the deblocking filter processing to the adaptive offset filter 137.

In Step S140, the adaptive offset filter 137 performs, on the basis of the offset filter information supplied from the lossless decoding unit 132, adaptive offset filter processing on the image obtained as a result of the deblocking filter processing. The adaptive offset filter 137 outputs the image obtained as a result of the adaptive offset filter processing to the adaptive loop filter 138.

In Step S141, the adaptive loop filter 138 performs, by using the filter coefficients supplied from the lossless decoding unit 132, adaptive loop filter processing on the image supplied from the adaptive offset filter 137. The adaptive loop filter 138 supplies the image obtained as a result of the adaptive loop filter processing to the frame memory 141 and the screen rearrangement buffer 139.

In Step S142, the frame memory 141 accumulates the image supplied from the addition unit 135 and the image supplied from the adaptive loop filter 138. Of the images accumulated in the frame memory 141 without being subjected to the filter processing, images neighboring the CUs are supplied as peripheral images to the intra prediction unit 143 through the switch 142. Meanwhile, the images that have been subjected to the filter processing and accumulated in the frame memory 141 are supplied as reference images to the inter prediction unit 51 through the switch 142.

In Step S143, the screen rearrangement buffer 139 stores the images supplied from the adaptive loop filter 138 in units of frames. The screen rearrangement buffer 139 rearranges the images in units of frames from the encoding order into the original display order and outputs the rearranged images to the D/A conversion unit 140.

In Step S144, the D/A conversion unit 140 performs D/A conversion on the image obtained as a result of the adaptive loop filter processing and outputs the converted image.

<Configuration Example of Inter Prediction Unit>

FIG. 14 is a block diagram illustrating a configuration example of the inter prediction unit.

In FIG. 14, the inter prediction unit 51 includes an inter prediction control unit 201, an L0 prediction block generation unit 202, an L1 prediction block generation unit 203, a BIO cost calculation unit 204, a BIO application determination unit 205, a Bi prediction block generation unit 206, a BIO processing-included Bi prediction block generation unit 207, a Bi prediction block selection unit 208, and a prediction block selection unit 209.

The inter prediction control unit 201 receives, in the case of the encoding device 1, inter prediction parameters from the motion prediction/compensation unit 47 (from the lossless decoding unit 132 in the case of the decoding device 101).

The inter prediction parameters include a PU position in a frame, a PU size, a prediction direction (any one of L0, L1, and Bi is set), reference image information, motion information, and the like.

The inter prediction control unit 201 includes, for example, a CPU (Central Processing Unit) or a microprocessor. The inter prediction control unit 201 executes a predetermined program by the CPU to control the units on the basis of the contents of inter prediction parameters.

The inter prediction control unit 201 supplies L0 prediction parameters to the L0 prediction block generation unit 202, thereby controlling the L0 prediction block generation unit 202. The L0 prediction parameters include PU positions, PU sizes, reference image information REFIDX_L0, and motion information MV_L0.

The inter prediction control unit 201 supplies L1 prediction parameters to the L1 prediction block generation unit 203, thereby controlling the L1 prediction block generation unit 203. The L1 prediction parameters include PU positions, PU sizes, reference image information REFIDX_L1, and motion information MV_L1.

The inter prediction control unit 201 supplies Bi prediction parameters to the BIO cost calculation unit 204, the Bi prediction block generation unit 206, and the BIO processing-included Bi prediction block generation unit 207, thereby controlling the BIO cost calculation unit 204, the Bi prediction block generation unit 206, and the BIO processing-included Bi prediction block generation unit 207. The Bi prediction parameters include PU sizes and the like.

The inter prediction control unit 201 supplies a BIO threshold to the BIO application determination unit 205, thereby controlling the BIO application determination unit 205.

The inter prediction control unit 201 supplies a prediction direction to the prediction block selection unit 209, thereby controlling the prediction block selection unit 209.
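As an illustration only, the inter prediction parameters handled by the inter prediction control unit 201 can be pictured as the following data structure. The class and field names are hypothetical. Only the listed contents (PU position, PU size, prediction direction, REFIDX_L0, MV_L0, REFIDX_L1, and MV_L1) are taken from the description, and representing positions, sizes, and motion vectors as integer pairs is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class PredDir(Enum):
    """Prediction direction: any one of L0, L1, and Bi is set."""
    L0 = "L0"
    L1 = "L1"
    BI = "Bi"


@dataclass(frozen=True)
class InterPredParams:
    """Hypothetical layout of the inter prediction parameters."""
    pu_pos: Tuple[int, int]                 # PU position in a frame (x, y)
    pu_size: Tuple[int, int]                # PU size (width, height)
    pred_dir: PredDir                       # prediction direction
    refidx_l0: Optional[int] = None         # reference image information REFIDX_L0
    mv_l0: Optional[Tuple[int, int]] = None  # motion information MV_L0
    refidx_l1: Optional[int] = None         # reference image information REFIDX_L1
    mv_l1: Optional[Tuple[int, int]] = None  # motion information MV_L1

    def uses_l0(self) -> bool:
        # L0 prediction parameters are needed for direction L0 or Bi.
        return self.pred_dir in (PredDir.L0, PredDir.BI)

    def uses_l1(self) -> bool:
        # L1 prediction parameters are needed for direction L1 or Bi.
        return self.pred_dir in (PredDir.L1, PredDir.BI)
```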

The L0 prediction block generation unit 202 operates when the prediction direction is L0 or Bi. The L0 prediction block generation unit 202 accesses the frame memory 44 (the frame memory 141 in the case of the decoding device 101) on the basis of L0 prediction parameters supplied from the inter prediction control unit 201, to thereby generate L0 prediction images from reference images. The generated L0 prediction images are supplied from the L0 prediction block generation unit 202 to the BIO cost calculation unit 204, the BIO application determination unit 205, the Bi prediction block generation unit 206, the BIO processing-included Bi prediction block generation unit 207, and the prediction block selection unit 209.

The L1 prediction block generation unit 203 operates when the prediction direction is L1 or Bi. The L1 prediction block generation unit 203 accesses the frame memory 44 (the frame memory 141 in the case of the decoding device 101) on the basis of L1 prediction parameters supplied from the inter prediction control unit 201, to thereby generate L1 prediction images from reference images. The generated L1 prediction images are supplied from the L1 prediction block generation unit 203 to the BIO cost calculation unit 204, the BIO application determination unit 205, the Bi prediction block generation unit 206, the BIO processing-included Bi prediction block generation unit 207, and the prediction block selection unit 209.

The BIO cost calculation unit 204 operates when the prediction direction is Bi. The BIO cost calculation unit 204 calculates, on the basis of Bi prediction parameters supplied from the inter prediction control unit 201, the SAD of an L0 prediction image supplied from the L0 prediction block generation unit 202 and an L1 prediction image supplied from the L1 prediction block generation unit 203. The calculated SAD is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.
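The SAD (sum of absolute differences) of the two prediction images can be sketched as follows, as a minimal illustration. The function name and the representation of a prediction image as a list of rows of integer samples are assumptions.

```python
def sad(l0_block, l1_block):
    """Sum of absolute differences between an L0 and an L1 prediction image.

    Both blocks are lists of rows of integer samples with identical shape.
    """
    if len(l0_block) != len(l1_block) or any(
        len(r0) != len(r1) for r0, r1 in zip(l0_block, l1_block)
    ):
        raise ValueError("L0 and L1 prediction images must have the same size")
    return sum(
        abs(a - b)
        for r0, r1 in zip(l0_block, l1_block)
        for a, b in zip(r0, r1)
    )
```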

The BIO application determination unit 205 operates when the prediction direction is Bi. The BIO application determination unit 205 compares the BIO threshold supplied from the inter prediction control unit 201 with the SAD supplied from the BIO cost calculation unit 204, thereby determining a BIO_ON flag. When the SAD is larger than the BIO threshold, the BIO_ON flag is set to BIO_ON=1, which indicates the application of BIO, and when the SAD is smaller than the BIO threshold, the BIO_ON flag is set to BIO_ON=0, which indicates the prohibition of the application of BIO.
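The threshold comparison can be sketched as follows, as an illustration only. The function name is hypothetical. The case where the SAD equals the BIO threshold is not specified in the description and is assumed here to yield BIO_ON=0.

```python
def decide_bio_on(sad_value, bio_threshold):
    """Return the BIO_ON flag (1: apply BIO, 0: do not apply BIO)."""
    # SAD larger than the threshold: BIO is applied.
    # Assumption: a SAD equal to the threshold is treated like a smaller SAD.
    return 1 if sad_value > bio_threshold else 0
```

A small SAD means that the L0 and L1 prediction images already agree closely, so the BIO processing is skipped in that case.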

The determined BIO_ON flag is supplied from the BIO application determination unit 205 to the Bi prediction block generation unit 206, the BIO processing-included Bi prediction block generation unit 207, and the Bi prediction block selection unit 208.

The Bi prediction block generation unit 206 operates on the basis of the BIO_ON flag supplied from the BIO application determination unit 205 when the prediction direction is Bi and BIO_ON=0 holds. The Bi prediction block generation unit 206 generates, on the basis of Bi prediction parameters supplied from the inter prediction control unit 201, Bi prediction images from L0 prediction images supplied from the L0 prediction block generation unit 202 and L1 prediction images supplied from the L1 prediction block generation unit 203. The generated Bi prediction images are supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208.
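The description does not state how the L0 and L1 prediction images are combined into a Bi prediction image. As a sketch under the assumption that a rounded sample-wise average is used (weighted variants are not considered), the generation without BIO processing could look as follows. The function name is hypothetical.

```python
def bi_predict(l0_block, l1_block):
    """Bi prediction without BIO, assumed to be a rounded sample-wise average."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(r0, r1)]
        for r0, r1 in zip(l0_block, l1_block)
    ]
```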

The BIO processing-included Bi prediction block generation unit 207 operates on the basis of the BIO_ON flag supplied from the BIO application determination unit 205 when the prediction direction is Bi and BIO_ON=1 holds. The BIO processing-included Bi prediction block generation unit 207 generates, on the basis of Bi prediction parameters supplied from the inter prediction control unit 201, BIO processing-included Bi prediction images from L0 prediction images supplied from the L0 prediction block generation unit 202 and L1 prediction images supplied from the L1 prediction block generation unit 203. The generated BIO processing-included Bi prediction images are supplied from the BIO processing-included Bi prediction block generation unit 207 to the Bi prediction block selection unit 208.

The Bi prediction block selection unit 208 selects Bi prediction images on the basis of the BIO_ON flag supplied from the BIO application determination unit 205. The Bi prediction block selection unit 208 selects Bi prediction images supplied from the Bi prediction block generation unit 206 in a case where BIO_ON=0 holds, and selects BIO processing-included Bi prediction images supplied from the BIO processing-included Bi prediction block generation unit 207 in a case where BIO_ON=1 holds. The selected Bi prediction images are supplied from the Bi prediction block selection unit 208 to the prediction block selection unit 209.

The prediction block selection unit 209 selects predicted images on the basis of a prediction direction supplied from the inter prediction control unit 201 and outputs the selected predicted images as the predicted images of inter prediction to the predicted image selection unit 48 of FIG. 9 (or the switch 144 of FIG. 12) on the subsequent stage.

The prediction block selection unit 209 selects L0 prediction images supplied from the L0 prediction block generation unit 202 in a case where the prediction direction is L0, and selects L1 prediction images supplied from the L1 prediction block generation unit 203 in a case where the prediction direction is L1. The prediction block selection unit 209 selects Bi prediction images supplied from the Bi prediction block selection unit 208 in a case where the prediction direction is Bi.
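The two-stage selection performed by the Bi prediction block selection unit 208 and the prediction block selection unit 209 can be summarized by the following sketch. The function names and the string encoding of the prediction direction ("L0", "L1", "Bi") are assumptions made for illustration only.

```python
def select_bi_prediction(bio_on, bi_image, bio_bi_image):
    """Stage 1 (unit 208): choose between plain Bi and BIO-included Bi."""
    return bio_bi_image if bio_on == 1 else bi_image


def select_prediction(direction, l0_image, l1_image, selected_bi_image):
    """Stage 2 (unit 209): choose the output by prediction direction."""
    if direction == "L0":
        return l0_image
    if direction == "L1":
        return l1_image
    if direction == "Bi":
        return selected_bi_image
    raise ValueError(f"unknown prediction direction: {direction!r}")


# Usage with placeholder labels standing in for prediction images.
bi = select_bi_prediction(1, "bi", "bio_bi")
out = select_prediction("Bi", "l0", "l1", bi)
```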

<Operation Example of Inter Prediction Unit>

FIG. 15 and FIG. 16 are flowcharts illustrating BIO-included Bi prediction that is performed by the inter prediction unit 51.

Note that, this processing is related-art BIO-included Bi prediction processing that is compared to BIO-included Bi prediction processing of the present technology described later. Further, this BIO-included Bi prediction processing is processing that is performed on the encoding side and the decoding side, is part of the motion prediction/compensation processing in Step S34 of FIG. 10, and is part of the inter prediction processing in Step S136 of FIG. 13.

In Step S301 of FIG. 15, the inter prediction control unit 201 acquires inter prediction parameters supplied from the motion prediction/compensation unit 47. Note that, in the case of the decoding device 101, the inter prediction parameters are supplied from the lossless decoding unit 132.

The inter prediction parameters include a PU position in a frame, a PU size, a prediction direction (any one of L0, L1, and Bi is set), reference image information, motion information, and the like.

The inter prediction control unit 201 supplies L0 prediction parameters to the L0 prediction block generation unit 202. The L0 prediction parameters include a PU position, a PU size, reference image information REFIDX_L0, and motion information MV_L0. The inter prediction control unit 201 supplies L1 prediction parameters to the L1 prediction block generation unit 203. The L1 prediction parameters include a PU position, a PU size, reference image information REFIDX_L1, and motion information MV_L1.

The inter prediction control unit 201 supplies Bi prediction parameters to the BIO cost calculation unit 204, the Bi prediction block generation unit 206, and the BIO processing-included Bi prediction block generation unit 207. The Bi prediction parameters are information indicating PU sizes.

The inter prediction control unit 201 supplies the BIO threshold to the BIO application determination unit 205. The inter prediction control unit 201 supplies a prediction direction to the prediction block selection unit 209, thereby controlling the prediction block selection unit 209.

In Step S302, the L0 prediction block generation unit 202 accesses the frame memory 44 on the basis of the L0 prediction parameters supplied from the inter prediction control unit 201, to thereby generate an L0 prediction image from a reference image. Note that, in the case of the decoding device 101, the reference image is referred to through an access to the frame memory 141.

In Step S303, the L1 prediction block generation unit 203 accesses the frame memory 44 on the basis of the L1 prediction parameters supplied from the inter prediction control unit 201, to thereby generate an L1 prediction image from a reference image.

The maximum buffer size in the processing in Steps S302 and S303 is the PU′ size. The PU′ size represents a size that corresponds to the PU size and is slightly larger than the PU size.

In Step S304, the BIO cost calculation unit 204 calculates, in units of 4×4, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of 4×4 are accumulated so that SAD_4×4 block that is the sum of the SADs is acquired.

In Step S305, the BIO cost calculation unit 204 calculates, in units of PUs, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of PUs are accumulated so that SAD_PU that is the sum of the SADs is acquired. The acquired SAD_PU is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.
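As a rough illustration of Steps S304 and S305, the sketch below computes one SAD per 4×4 block and then sums them into a SAD for the whole PU. The function names are hypothetical, the images are assumed to be 2-D lists whose dimensions are multiples of 4, and deriving the PU-level SAD by summing the 4×4 SADs is one possible arrangement, not necessarily the one used by the BIO cost calculation unit 204.

```python
def sad_4x4_map(l0_image, l1_image, block=4):
    """Return the list of per-block SADs in raster scan order."""
    height, width = len(l0_image), len(l0_image[0])
    sads = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            total = 0
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    total += abs(l0_image[y][x] - l1_image[y][x])
            sads.append(total)
    return sads


def sad_pu(sads_4x4):
    """Accumulate the per-block SADs into the SAD of the whole PU."""
    return sum(sads_4x4)


# An 8x8 PU where every L0 sample is 5 and every L1 sample is 3.
l0 = [[5] * 8 for _ in range(8)]
l1 = [[3] * 8 for _ in range(8)]
blocks = sad_4x4_map(l0, l1)   # four blocks, each with SAD 2 * 16 = 32
total = sad_pu(blocks)         # 128
```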

In Step S306, the BIO application determination unit 205 determines a BIO_PU_ON flag on the basis of SAD_PU>=BIO threshold_PU. SAD_PU is supplied from the BIO cost calculation unit 204 and BIO threshold_PU is supplied from the inter prediction control unit 201. The determined BIO_PU_ON flag is supplied from the BIO application determination unit 205 to the Bi prediction block generation unit 206, the BIO processing-included Bi prediction block generation unit 207, and the Bi prediction block selection unit 208.

When SAD_PU is equal to or larger than BIO threshold_PU, the BIO_PU_ON flag is set to BIO_PU_ON=1, which indicates the application of BIO, and when SAD_PU is smaller than BIO threshold_PU, the BIO_PU_ON flag is set to BIO_PU_ON=0, which indicates the prohibition of the application of BIO.

In Step S307, the Bi prediction block generation unit 206 and the BIO processing-included Bi prediction block generation unit 207 determine whether or not the BIO_PU_ON flag is 1.

In a case where it is determined in Step S307 that the BIO_PU_ON flag is not 1, the processing proceeds to Step S308.

In Step S308, the Bi prediction block generation unit 206 generates a Bi prediction block PU from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The generated Bi prediction block PU is supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208. After that, the BIO-included Bi prediction processing ends.

The maximum buffer size in the processing in Step S308 is the PU size.

Meanwhile, in a case where it is determined in Step S307 that the BIO_PU_ON flag is 1, the processing proceeds to Step S309.

In Steps S309 to S320, the BIO processing-included Bi prediction block generation unit 207 performs the processing of generating a BIO processing-included Bi prediction image.

In Step S309, the BIO processing-included Bi prediction block generation unit 207 calculates a plurality of gradients from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The maximum buffer size in the processing in Step S309 is the total size of nine PU′s.

In Step S310, the BIO processing-included Bi prediction block generation unit 207 acquires the number of 4×4 blocks included in the PU.

In Step S311, the BIO processing-included Bi prediction block generation unit 207 sets the 4×4 block number to 0.

In Step S312 of FIG. 16, the BIO processing-included Bi prediction block generation unit 207 determines whether or not the 4×4 block number is smaller than the number of 4×4 blocks.

In a case where it is determined in Step S312 that the 4×4 block number is smaller than the number of 4×4 blocks, the processing proceeds to Step S313.

In Step S313, the BIO processing-included Bi prediction block generation unit 207 acquires the position in the PU and SAD_4×4 from the 4×4 block number.

In Step S314, the BIO processing-included Bi prediction block generation unit 207 determines BIO_4×4_ON on the basis of SAD_4×4>=BIO threshold_4×4.

In Step S315, the BIO processing-included Bi prediction block generation unit 207 determines whether or not the BIO_4×4_ON flag is 1.

In a case where it is determined in Step S315 that the BIO_4×4_ON flag is not 1, the processing proceeds to Step S316.

In Step S316, the BIO processing-included Bi prediction block generation unit 207 generates a Bi prediction value from the L0 prediction image and the L1 prediction image in the region of the 4×4 block number.

In a case where it is determined in Step S315 that the BIO_4×4_ON flag is 1, the processing proceeds to Step S317.

In Step S317, the BIO processing-included Bi prediction block generation unit 207 calculates a velocity from the plurality of gradients in the region of the 4×4 block number.

In Step S318, the BIO processing-included Bi prediction block generation unit 207 generates a BIO prediction value from the L0 prediction image, the L1 prediction image, the gradients, and the velocity in the region of the 4×4 block number.

After Steps S316 and S318, the processing proceeds to Step S319.

In Step S319, the BIO processing-included Bi prediction block generation unit 207 stores the prediction value at the position of the 4×4 block number in the buffer. The maximum buffer size in the processing in Step S319 is the PU size.

In Step S320, the BIO processing-included Bi prediction block generation unit 207 increments the 4×4 block number. After that, the processing returns to Step S312, and the later processing is repeated.

After Step S308 or in a case where it is determined in Step S312 that the 4×4 block number is not smaller than the number of 4×4 blocks, the BIO-included Bi prediction ends.

Note that, in the BIO-included Bi processing described above, the SAD of the L0 prediction block and the L1 prediction block is calculated for the entire PU in Step S305, the SAD is compared to the threshold to determine whether or not to apply BIO processing in Step S306, and the processing branches in Step S307.

Thus, in a case where inter prediction is performed on a PU larger than a VPDU, it is difficult to virtually partition the PU into a plurality of vPUs. As a result, the PU′ size, which is slightly larger than the PU size, is required for the buffers required in Steps S302, S303, and S309 to achieve the gradient calculation in Step S309 and the velocity calculation in Step S317. The maximum PU′ size is a size of 130×130 obtained by adding 2 to the PU horizontal size and the PU vertical size.

Further, in Step S308, the buffer having the PU size is required. These mean that the BIO-included inter prediction unit 51 requires a large buffer resource.

Further, in a case where the inter prediction unit 51 that requires this buffer is implemented by HW (hardware), due to a large difference between the pipeline delay of BIO-included inter prediction and the pipeline delay of TU processing, HW implementation that maintains throughput is difficult to achieve.

This affects both the encoding and decoding sides. On the encoding side, this can be avoided by a self-limiting process such as always splitting CUs into 64×64 or less. In order to secure the degree of freedom of the encoding side, however, a solution is desired. On the decoding side, which is required to meet the standard, a large HW resource is essential.

Accordingly, as described above, in the present technology, a unit of processing in calculation of a cost that is used for determining whether or not to perform bidirectional prediction such as BIO is partitioned into partitioned processing units each of which corresponds to the VPDU size or is equal to or smaller than the VPDU size, and the determination is made by using the cost calculated on the basis of the partitioned processing units.

The size corresponding to the VPDU size means the VPDU′ size slightly larger than the VPDU size.

<Operation Example of Inter Prediction Unit>

FIG. 17 and FIG. 18 are flowcharts illustrating, as an operation example according to the first embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

The case of the encoding device 1 is illustrated in FIG. 17 and FIG. 18, and since similar processing is performed in the case of the decoding device 101, the description thereof is omitted.

In Step S401, the inter prediction control unit 201 acquires inter prediction parameters supplied from the motion prediction/compensation unit 47.

In Step S402, the inter prediction control unit 201 acquires the number of vPUs included in the PU. That is, in a case where the PU is larger than the VPDU, the PU is virtually partitioned into a plurality of vPUs. In a case where the PU is 128×128, the number of vPUs is set to 4. In a case where the PU is 128×64 or 64×128, the number of vPUs is set to 2. In a case where the PU is 64×64 or less, the number of vPUs is set to 1. In the case where the number of vPUs is 1, the PU is not virtually partitioned, and processing substantially similar to that of FIG. 15 and FIG. 16 is performed.
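The vPU count in Step S402 follows directly from the PU size and the VPDU size. A minimal sketch, assuming a 64×64 VPDU and a hypothetical helper name `num_vpus`:

```python
VPDU_SIZE = 64  # assumed VPDU width and height, as in the example above


def num_vpus(pu_width, pu_height, vpdu=VPDU_SIZE):
    """Number of vPUs obtained by virtually partitioning a PU.

    Ceiling division gives one vPU per VPDU-sized tile; a PU of 64x64
    or less is not partitioned and yields 1.
    """
    cols = -(-pu_width // vpdu)
    rows = -(-pu_height // vpdu)
    return cols * rows


counts = [num_vpus(128, 128), num_vpus(128, 64), num_vpus(64, 128), num_vpus(64, 64)]
# counts is [4, 2, 2, 1]
```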

In Step S403, the inter prediction control unit 201 sets 0 as a vPU number that is processed first.

In Step S404, the inter prediction control unit 201 determines whether or not the vPU number is smaller than the number of vPUs.

In a case where it is determined in Step S404 that the vPU number is smaller than the number of vPUs, the processing proceeds to Step S405.

In Step S405, the inter prediction control unit 201 acquires, from the PU size and the vPU number, the position and size of the vPU indicating a region in the PU to be processed.

FIG. 19 is a diagram illustrating the correspondences between PU size, vPU number, and processing position and size.

When the PU size is 128×128 and the vPU number is 0, the processing position is at the upper left and the size is 64×64. When the vPU number is 1, the processing position is at the upper right and the size is 64×64. When the vPU number is 2, the processing position is at the lower left and the size is 64×64. When the vPU number is 3, the processing position is at the lower right and the size is 64×64.

When the PU size is 128×64 and the vPU number is 0, the processing position is on the left and the size is 64×64. When the vPU number is 1, the processing position is on the right and the size is 64×64.

When the PU size is 64×128 and the vPU number is 0, the processing position is at the top and the size is 64×64. When the vPU number is 1, the processing position is at the bottom and the size is 64×64.

When the PU size is 64×64 or less and the vPU number is 0, the processing position is the PU itself.
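The correspondences listed above amount to a raster-scan mapping from the vPU number to a region in the PU. The following sketch reproduces them under the assumption of a 64×64 VPDU; `vpu_region` is a hypothetical name, and the returned tuple is (x offset, y offset, width, height) relative to the upper left of the PU.

```python
def vpu_region(pu_width, pu_height, vpu_number, vpdu=64):
    """Position and size of the vPU with the given number, in raster order."""
    cols = -(-pu_width // vpdu)          # vPUs per row (ceiling division)
    x = (vpu_number % cols) * vpdu
    y = (vpu_number // cols) * vpdu
    width = min(vpdu, pu_width - x)
    height = min(vpdu, pu_height - y)
    return x, y, width, height


# 128x128 PU: upper left, upper right, lower left, lower right.
regions = [vpu_region(128, 128, n) for n in range(4)]
```

For a PU of 64×64 or less, vPU number 0 maps to the PU itself, for example `vpu_region(32, 16, 0)` gives `(0, 0, 32, 16)`.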

Returning to FIG. 17, the position and size of the vPU acquired in Step S405 are supplied to the L0 prediction block generation unit 202 and the L1 prediction block generation unit 203.

In Step S406, the L0 prediction block generation unit 202 generates an L0 prediction block in the region of the vPU number.

In Step S407, the L1 prediction block generation unit 203 generates an L1 prediction block in the region of the vPU number.

The maximum buffer size in the processing in Steps S406 and S407 is, for example, the VPDU′ size including a slightly larger region that is required for the gradient calculation in Step S413 and the velocity calculation in Step S421. The VPDU′ size represents the above-mentioned size corresponding to the VPDU size, which is the size slightly larger than the VPDU size. The VPDU′ size is 66×66 obtained by adding 2 to the horizontal and vertical sizes, for example.
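To make the buffer saving concrete, the arithmetic below compares the maximum PU′ size of the related-art flow (130×130) with the VPDU′ size (66×66). The helper names are hypothetical and the counts are in samples, not bytes, since the bit depth of the stored samples is not specified here.

```python
def padded_size(width, height, margin=2):
    """Size after adding `margin` to the horizontal and vertical sizes."""
    return width + margin, height + margin


def sample_count(size):
    width, height = size
    return width * height


pu_prime = padded_size(128, 128)     # (130, 130), maximum PU' size
vpdu_prime = padded_size(64, 64)     # (66, 66), VPDU' size
pu_samples = sample_count(pu_prime)      # 16900
vpdu_samples = sample_count(vpdu_prime)  # 4356
```

Under these assumptions, each of the L0 and L1 prediction block buffers shrinks from 16900 to 4356 samples, which is a little more than one quarter of the original size.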

In the determination of BIO application on the subsequent stage, SAD values up to the VPDU size are used, and hence the buffer size for storing the L0 prediction block and L1 prediction block generated here can be based on the VPDU size.

In Step S408, the BIO cost calculation unit 204 calculates, in units of 4×4 in the vPU, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of 4×4 are accumulated so that SAD_4×4 block that is the sum of the SADs is acquired.

To determine whether to apply BIO by a 4×4 block unit, which is the unit in velocity calculation, to thereby achieve early termination for non-effective cases on the subsequent stage, this SAD_4×4 block is required to be stored. However, the buffer size for storing SAD_4×4 block can be reduced to ¼ of the size in Step S304 of FIG. 15.
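The ¼ figure can be checked by counting the SAD_4×4 entries that have to be held at once. The sketch below assumes the largest PU of 128×128 for the related-art flow and a 64×64 vPU for the present technology; `sad_4x4_entries` is a hypothetical helper.

```python
def sad_4x4_entries(width, height, block=4):
    """Number of per-4x4-block SAD values needed for a region."""
    return (width // block) * (height // block)


related_art = sad_4x4_entries(128, 128)   # 1024 entries for a whole PU
per_vpu = sad_4x4_entries(64, 64)         # 256 entries for one vPU
ratio = per_vpu / related_art             # 0.25
```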

In Step S409, the BIO cost calculation unit 204 calculates, in units of vPUs, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of vPUs are accumulated so that SAD_vPU that is the sum of the SADs is acquired. The acquired SAD_vPU is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.

In Step S410, the BIO application determination unit 205 determines the BIO_vPU_ON flag on the basis of SAD_vPU>=BIO threshold_vPU. SAD_vPU is supplied from the BIO cost calculation unit 204 and BIO threshold_vPU is supplied from the inter prediction control unit 201. BIO threshold_vPU is a value obtained by scaling BIO threshold_PU to a value based on the vPU size obtained in Step S405.
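The exact scaling rule for BIO threshold_vPU is not given here. One plausible reading, shown purely as an assumption, is to scale the PU threshold in proportion to the number of samples in the vPU; the function name and the integer rounding are likewise hypothetical.

```python
def scale_threshold(threshold_pu, pu_width, pu_height, vpu_width, vpu_height):
    """Scale a PU-level threshold to a vPU-level threshold.

    Assumption: the threshold is proportional to the region area, so a
    vPU covering one quarter of the PU gets one quarter of the threshold.
    """
    return threshold_pu * (vpu_width * vpu_height) // (pu_width * pu_height)


def bio_vpu_on(sad_vpu, threshold_vpu):
    """BIO_vPU_ON determination of Step S410 (SAD_vPU >= threshold)."""
    return 1 if sad_vpu >= threshold_vpu else 0


threshold_vpu = scale_threshold(4096, 128, 128, 64, 64)   # 1024
flag = bio_vpu_on(1500, threshold_vpu)                    # 1
```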

The determined BIO_vPU_ON flag is supplied from the BIO application determination unit 205 to the Bi prediction block generation unit 206, the BIO processing-included Bi prediction block generation unit 207, and the Bi prediction block selection unit 208.

In Step S411, the Bi prediction block generation unit 206 and the BIO processing-included Bi prediction block generation unit 207 determine whether or not the BIO_vPU_ON flag is 1.

In a case where it is determined in Step S411 that the BIO_vPU_ON flag is not 1, the processing proceeds to Step S412 since BIO is not effective for the entire vPU.

In Step S412, the Bi prediction block generation unit 206 generates a Bi prediction block vPU from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The generated Bi prediction block vPU is stored in the buffer and supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208.

In a case where the pipeline is structured in HW implementation, TU processing in units of VPDUs is executed in parallel to vPU inter prediction, and hence next processing can start at this timing. Thus, it is enough that the buffer prepared here to store Bi prediction has the maximum VPDU size. After that, the processing proceeds to Step S425 of FIG. 18.

Meanwhile, in a case where it is determined in Step S411 that the BIO_vPU_ON flag is 1, the processing proceeds to Step S413.

In Step S413, the BIO processing-included Bi prediction block generation unit 207 calculates a plurality of gradients from the L0 prediction block supplied from the L0 prediction block generation unit 202 and the L1 prediction block supplied from the L1 prediction block generation unit 203.

In Step S413, 9 types of intermediate parameters are calculated from the L0 prediction block and the L1 prediction block. The amount of change between the L0 prediction block and the L1 prediction block, and the amount of horizontal or vertical change in pixel value in each prediction block are calculated. These are collectively referred to as "gradient." The gradients are required to be calculated for as many pixels as there are in the prediction blocks, and hence it is enough that the buffer required here has the total size of nine VPDU′s at most.

In Step S414 of FIG. 18, the BIO processing-included Bi prediction block generation unit 207 acquires the number of 4×4 blocks included in the vPU. For example, in the case of a vPU of 64×64, the number of 4×4 blocks is 256. In the optical flow, the highest prediction accuracy is achieved when velocities are obtained in units of pixels to calculate prediction values. This, however, requires large-scale calculation. In BIO, velocities are calculated in units of 4×4 blocks in view of the balanced trade-off of performance and cost.

In Step S415, the BIO processing-included Bi prediction block generation unit 207 sets 0 as a 4×4 block number that is processed first.

In Step S416, the BIO processing-included Bi prediction block generation unit 207 determines whether or not the 4×4 block number is smaller than the number of 4×4 blocks.

In a case where it is determined in Step S416 that the 4×4 block number is smaller than the number of 4×4 blocks, the processing proceeds to Step S417.

In Step S417, the BIO processing-included Bi prediction block generation unit 207 acquires the position in the vPU and SAD_4×4 from the 4×4 block number. The 4×4 blocks are processed in the raster scan order.
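The mapping from a 4×4 block number to its position in the vPU in raster scan order can be sketched as follows. The helper names are hypothetical, and positions are given in samples relative to the upper left of the vPU.

```python
def num_4x4_blocks(vpu_width, vpu_height, block=4):
    """Number of 4x4 blocks in a vPU (Step S414)."""
    return (vpu_width // block) * (vpu_height // block)


def block_position(block_number, vpu_width, block=4):
    """Upper-left (x, y) of the 4x4 block with the given raster-scan number."""
    blocks_per_row = vpu_width // block
    x = (block_number % blocks_per_row) * block
    y = (block_number // blocks_per_row) * block
    return x, y


count = num_4x4_blocks(64, 64)        # 256
first = block_position(0, 64)         # (0, 0)
second_row = block_position(16, 64)   # (0, 4), first block of the second row
last = block_position(255, 64)        # (60, 60)
```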

In Step S418, the BIO processing-included Bi prediction block generation unit 207 determines BIO_4×4_ON on the basis of SAD_4×4>=BIO threshold_4×4.

In Step S419, the BIO processing-included Bi prediction block generation unit 207 determines whether or not the BIO_4×4_ON flag is 1.

In a case where it is determined in Step S419 that the BIO_4×4_ON flag is not 1, the processing proceeds to Step S420 since BIO is not expected to be effective for the 4×4 block.

In Step S420, the BIO processing-included Bi prediction block generation unit 207 calculates the average of the L0 prediction image and the L1 prediction image in the region of the 4×4 block number, to thereby generate a Bi prediction value.
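A minimal sketch of the averaging in Step S420 is given below. The rounding rule `(a + b + 1) >> 1` is an assumption chosen for illustration; the text above only states that the average of the L0 and L1 prediction images is taken.

```python
def bi_average(l0_block, l1_block):
    """Sample-wise average of two equally sized blocks.

    Assumption: integer samples, rounded to nearest with ties rounded up.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(l0_block, l1_block)
    ]


l0 = [[1, 2], [3, 4]]
l1 = [[3, 2], [4, 7]]
bi = bi_average(l0, l1)   # [[2, 2], [4, 6]]
```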

In a case where it is determined in Step S419 that the BIO_4×4_ON flag is 1, the processing proceeds to Step S421.

In Step S421, the BIO processing-included Bi prediction block generation unit 207 calculates a velocity from the plurality of gradients in the region of the 4×4 block number.

In Step S422, the BIO processing-included Bi prediction block generation unit 207 generates a BIO prediction value from the L0 prediction image, the L1 prediction image, the gradients, and the velocity in the region of the 4×4 block number.

After Steps S420 and S422, the processing proceeds to Step S423.

In Step S423, the BIO processing-included Bi prediction block generation unit 207 stores the prediction value generated in Step S420 or Step S422 at the position of the 4×4 block number in the buffer. The maximum buffer size in the processing in Step S423 is the VPDU size. The buffer may be the buffer that is used in the processing in Step S412.

In Step S424, the BIO processing-included Bi prediction block generation unit 207 increments the 4×4 block number. After that, the processing returns to Step S416, and the later processing is repeated.

After Step S412 or in a case where it is determined in Step S416 that the 4×4 block number is equal to or larger than the number of 4×4 blocks, the processing proceeds to Step S425.

In Step S425, the inter prediction control unit 201 increments the vPU number. The processing returns to Step S404, and the later processing is repeated.

In a case where it is determined in Step S404 that the vPU number is equal to or larger than the number of vPUs, the BIO processing-included Bi prediction ends.
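The outer loop of FIG. 17 and FIG. 18 can be condensed into the following sketch, which covers only the per-vPU BIO application determination (Steps S404 to S410 and S425) and omits the generation of the prediction values. The function name is hypothetical. For simplicity the sketch indexes into L0 and L1 arrays covering the whole PU, whereas the flow described above generates the L0 and L1 prediction blocks one vPU at a time so that only VPDU′-sized buffers are needed.

```python
def bio_flags_per_vpu(l0_image, l1_image, threshold_vpu, vpdu=64):
    """Return one BIO_vPU_ON flag per vPU, in raster scan order."""
    height, width = len(l0_image), len(l0_image[0])
    cols = -(-width // vpdu)
    rows = -(-height // vpdu)
    flags = []
    for vpu_number in range(cols * rows):          # S404 / S425
        x0 = (vpu_number % cols) * vpdu            # S405: vPU position
        y0 = (vpu_number // cols) * vpdu
        x1 = min(x0 + vpdu, width)
        y1 = min(y0 + vpdu, height)
        sad_vpu = sum(                             # S408 to S409
            abs(l0_image[y][x] - l1_image[y][x])
            for y in range(y0, y1)
            for x in range(x0, x1)
        )
        flags.append(1 if sad_vpu >= threshold_vpu else 0)   # S410
    return flags


# 128x128 PU whose L0 and L1 predictions differ only in the upper-right vPU.
l0 = [[0] * 128 for _ in range(128)]
l1 = [[1 if (x >= 64 and y < 64) else 0 for x in range(128)] for y in range(128)]
flags = bio_flags_per_vpu(l0, l1, threshold_vpu=1000)   # [0, 1, 0, 0]
```

In this example only the vPU with the vPU number of 1 would proceed to the gradient calculation in Step S413; the other three vPUs would take the plain Bi prediction path of Step S412.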

FIG. 20 and FIG. 21 are diagrams illustrating comparisons between related-art operation and operation according to the first embodiment of the present technology.

In the upper part of FIG. 20, the related-art operation and the operation according to the first embodiment of the present technology are illustrated in terms of ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×128 and VPDU=64×64 holds. In the case of the CU (PU) of 128×128, the CU (PU) is partitioned into four vPUs that are SAD calculation regions for BIO_vPU_ON determination.

In the lower part of FIG. 20, the related-art operation and the operation according to the first embodiment of the present technology are illustrated in terms of ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×64 and VPDU=64×64 holds. In the case of the CU (PU) of 128×64, the CU (PU) is partitioned into two left and right vPUs that are SAD calculation regions for BIO_vPU_ON determination.

In the upper part of FIG. 21, the related-art operation and the operation according to the first embodiment of the present technology are illustrated in terms of ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 64×128 and VPDU=64×64 holds. In the case of the CU (PU) of 64×128, the CU (PU) is partitioned into two top and bottom vPUs that are SAD calculation regions for BIO_vPU_ON determination.

In the lower part of FIG. 21, the related-art operation and the operation according to the first embodiment of the present technology are illustrated in terms of ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 64×64 or less and VPDU=64×64 holds. In the case of the CU (PU) of 64×64 or less, the CU (PU) is not partitioned and includes a single vPU that is a SAD calculation region for BIO_vPU_ON determination.

In the related-art operation, the SAD for the entire PU is required, and hence the large L0 prediction block and the large L1 prediction block are required to be prepared and stored in advance. In the present technology, on the other hand, for a PU larger than the VPDU, whether to apply BIO is determined for each vPU obtained by virtually partitioning the PU, and the buffer for the L0 prediction block and the L1 prediction block prepared and stored in advance can therefore be reduced in size.

Further, the buffers that are used in Steps S412, S413, and S423 of FIG. 17 and FIG. 18 can be reduced to ¼ of the buffers that are used in Steps S308, S309, and S319 of FIG. 15 and FIG. 16.

Like BIO, FRUC (Frame Rate Up-Conversion) and DMVR (Decoder-side motion vector refinement) are tools that generate two prediction blocks on the decoding side and make a determination through cost calculation, to thereby enhance the encoding efficiency of inter prediction. In FRUC and DMVR, L0 prediction blocks and L1 prediction blocks that are larger than a PU size are generated, and SADs or similar costs are calculated for the purpose of MV correction rather than for early termination as in BIO.

In a case where PUs are larger than VPDUs, processing similar to that in the present technology is required. Also in FRUC and DMVR, as in the present technology, a case where PUs are larger than VPDUs can be handled as follows: the PU is virtually partitioned into a plurality of vPUs, and MV correction is performed for each vPU.

The SAD calculation and BIO application determination for an entire PU in the related-art operation and the SAD calculation and BIO application determination for each vPU in the present technology, which are described above, are mainly intended to achieve early termination, and hence a further reduction can be achieved.

FIRST MODIFIED EXAMPLE

In the first embodiment described above, the example is described in which, in a case where PUs are larger than VPDUs, the PU is virtually partitioned into a plurality of vPUs, and a SAD is calculated to determine whether to apply BIO for each vPU. The vPUs of the PU are originally included in the same PU, and hence it is conceivable that the tendency of one portion is similar to the tendencies of the other portions.

FIG. 22 and FIG. 23 are diagrams illustrating, as a first modified example, an example in which, in a case where PUs are larger than VPDUs, a BIO determination result for a vPU number of 0 is also used for other vPUs on the premise of the tendency described above.

In the upper part of FIG. 22, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×128 and VPDU=64×64 holds. In the case of the CU (PU) of 128×128, of vPUs obtained by partitioning the CU (PU) into four as SAD calculation regions for BIO_vPU_ON determination, a SAD for the vPU at the upper left (vPU number=0) is calculated, and the result for the vPU having the vPU number of 0 is copied and used for the remaining vPUs (upper right, lower left, and lower right).

In the lower part of FIG. 22, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×64 and VPDU=64×64 holds. In the case of the CU (PU) of 128×64, of vPUs obtained by partitioning the CU (PU) into two as SAD calculation regions for BIO_vPU_ON determination, a SAD for the vPU on the left (vPU number=0) is calculated, and the result for the vPU having the vPU number of 0 is copied and used for the other vPU (right).

In the upper part of FIG. 23, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 64×128 and VPDU=64×64 holds. In the case of the CU (PU) of 64×128, of vPUs obtained by partitioning the CU (PU) into two as SAD calculation regions for BIO_vPU_ON determination, a SAD for the vPU at the top (vPU number=0) is calculated, and the result for the vPU having the vPU number of 0 is copied and used for the other vPU (bottom).

In the lower part of FIG. 23, there is illustrated a range in which a SAD has been calculated first for BIO application determination in a case where the CU (PU) is 64×64 or less and VPDU=64×64 holds. In the case of the CU (PU) of 64×64 or less, the CU (PU) is not partitioned and includes a single vPU as a SAD calculation region for BIO_vPU_ON determination.

<Operation Example of Inter Prediction Unit>

FIG. 24 and FIG. 25 are flowcharts illustrating BIO-included Bi prediction in the case of FIG. 23.

In Steps S501 to S508 and Steps S510 to S526 of FIG. 24 and FIG. 25, processing basically similar to that in Steps S401 to S425 of FIG. 17 and FIG. 18 is performed, and hence the description thereof, which is redundant, is appropriately omitted.

In Step S508 of FIG. 25, the BIO cost calculation unit 204 calculates, in units of 4×4 in the vPU, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of 4×4 are accumulated so that SAD_4×4 block that is the sum of the SADs is acquired.

In Step S509, the BIO cost calculation unit 204 determines whether or not the vPU number is 0.

In a case where it is determined in Step S509 that the vPU number is 0, the processing proceeds to Step S510.

In Step S510, the BIO cost calculation unit 204 calculates, in units of vPUs, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of vPUs are accumulated so that SAD_vPU that is the sum of the SADs is acquired. The acquired SAD_vPU is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.

In Step S511, the BIO application determination unit 205 determines the BIO_vPU_ON flag on the basis of SAD_vPU>=BIO_threshold_vPU. SAD_vPU is supplied from the BIO cost calculation unit 204 and BIO_threshold_vPU is supplied from the inter prediction control unit 201. After that, the processing proceeds to Step S512.

Meanwhile, in a case where it is determined that the vPU number is not 0, the processing skips Steps S510 and S511 and proceeds to Step S512.

As described above, in the PU, the SAD accumulation and the BIO determination are performed only for the vPU that is positioned first in the raster scan order, with the result that the processing related to early termination and the time taken for the processing can be reduced.
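The first modified example can be sketched as follows. This is only an illustrative sketch, not the implementation of the BIO cost calculation unit 204: the function names `sad` and `bio_flags_first_vpu_only`, the list-of-rows block representation, and the sample values are assumptions introduced here.

```python
def sad(l0_block, l1_block):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row0, row1 in zip(l0_block, l1_block)
        for a, b in zip(row0, row1)
    )


def bio_flags_first_vpu_only(l0_vpus, l1_vpus, bio_threshold_vpu):
    """Return one BIO_vPU_ON flag per vPU, computing a SAD only for vPU 0.

    l0_vpus / l1_vpus are lists of prediction blocks in raster scan order
    (hypothetical representation). The flag decided for vPU number 0 is
    copied to every later vPU, mirroring Steps S509 to S511.
    """
    flags = []
    flag = False
    for vpu_number, (l0, l1) in enumerate(zip(l0_vpus, l1_vpus)):
        if vpu_number == 0:
            # SAD_vPU >= BIO_threshold_vPU decides BIO_vPU_ON (Step S511).
            flag = sad(l0, l1) >= bio_threshold_vpu
        # vPUs other than number 0 skip the SAD and reuse the stored result.
        flags.append(flag)
    return flags
```

With two vPUs, only the first pair of blocks contributes to the decision; the second vPU inherits the flag regardless of its own content.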

SECOND MODIFIED EXAMPLE

FIG. 26 and FIG. 27 are diagrams illustrating, as a second modified example, an example in which whether to apply BIO is determined with a partial SAD value in each vPU.

In the upper part of FIG. 26, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×128 and VPDU=64×64 holds. In the case of the CU (PU) of 128×128, a SAD is calculated for an upper left partial region (32×32) of each vPU obtained by partitioning the CU (PU) into four as SAD calculation regions for BIO_vPU_ON determination.

In the lower part of FIG. 26, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 128×64 and VPDU=64×64 holds. In the case of the CU (PU) of 128×64, a SAD is calculated for an upper left partial region (32×32) of each vPU obtained by partitioning the CU (PU) into two as SAD calculation regions for BIO_vPU_ON determination.

In the upper part of FIG. 27, there are illustrated ranges in which SADs have been calculated first for BIO application determination in a case where the CU (PU) is 64×128 and VPDU=64×64 holds. In the case of the CU (PU) of 64×128, a SAD is calculated for an upper left partial region (32×32) of each vPU obtained by partitioning the CU (PU) into two as SAD calculation regions for BIO_vPU_ON determination.

In the lower part of FIG. 27, there is illustrated a range in which a SAD has been calculated first for BIO application determination in a case where the CU (PU) is 64×64 or less and VPDU=64×64 holds. In the case of the CU (PU) of 64×64 or less, a SAD is calculated for an upper left partial region (32×32) of the CU (PU), which is not partitioned and includes a single vPU as a SAD calculation region for BIO_vPU_ON determination.

As described above, FIG. 26 and FIG. 27 illustrate the examples in which whether to apply BIO is determined in the upper-left ¼ region of each vPU. The upper-left ¼ regions are used in consideration of compatibility with a case where the pipeline is structured with HW. This is because BIO application determination becomes possible when the L0 prediction blocks and the L1 prediction blocks in the upper-left ¼ regions are prepared.

Whether to apply BIO is determined only for the partial region of each vPU so that the buffers that are prepared on the pipeline stages can be reduced to be smaller than the VPDU size.

Note that the partial region may have any size, and the cost (SAD) calculation can even be specified for a partial region having a size of 0×0, for example. A size of 0 means that the cost is not calculated and early termination is skipped.
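A minimal sketch of the partial SAD of the second modified example is given below, assuming blocks stored as lists of rows and an upper-left partial region. The function name `partial_sad` and the use of `None` to signal "cost not calculated" are assumptions made for illustration.

```python
def partial_sad(l0_block, l1_block, region_w, region_h):
    """SAD over the upper-left region_w x region_h samples of a vPU.

    A region size of 0 means the cost is not calculated; None is returned
    so that the caller can skip early termination (hypothetical convention).
    """
    if region_w == 0 or region_h == 0:
        return None
    return sum(
        abs(l0_block[y][x] - l1_block[y][x])
        for y in range(region_h)
        for x in range(region_w)
    )
```

For a 64×64 vPU, calling `partial_sad(l0, l1, 32, 32)` touches only the upper-left ¼ region, which is why the determination can start before the rest of the prediction blocks is available.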

Further, the region for calculating a SAD necessary for determining BIO_vPU_ON in each vPU can be dynamically changed.

<Operation Example of Inter Prediction Unit>

FIG. 28 and FIG. 29 are flowcharts illustrating the processing of determining a partial SAD calculation region for BIO_vPU_ON determination in each vPU.

In FIG. 28 and FIG. 29, the two MVs for generating an L0 prediction block and an L1 prediction block are divided into four components, namely, their horizontal components and vertical components, and whether the correction of BIO is effective is determined for the region farthest from a reference position on the assumption that such a region has inaccurate motion information. This processing is performed before Step S509 of FIG. 25, for example. In this case, the following flow is conceivable: in Step S509, it is determined whether or not the vPU number corresponds to the set region, and the processing in Steps S510 and S511 is performed only on the set region.

In Step S601, the inter prediction control unit 201 acquires MVL0_x and MVL0_y of L0 prediction and MVL1_x and MVL1_y of L1 prediction.

In Step S602, the inter prediction control unit 201 selects the one of the four MV components that has the maximum absolute value and substitutes it into MV_MAX.

In Step S603, the inter prediction control unit 201 determines whether or not |MV_MAX|<MV_threshold holds.

In a case where it is determined in Step S603 that |MV_MAX|<MV_threshold holds, the processing proceeds to Step S604.

In Step S604, the inter prediction control unit 201 sets the central part of the vPU as the SAD calculation region.

In Step S605, the inter prediction control unit 201 determines whether or not PU size<vPU size holds.

In a case where it is determined in Step S605 that PU size<vPU size holds, the processing proceeds to Step S606.

In Step S606, the inter prediction control unit 201 sets horizontal size=horizontal PU size/2 and vertical size=vertical PU size/2.

In a case where it is determined in Step S605 that PU size<vPU size does not hold, the processing proceeds to Step S607.

In Step S607, the inter prediction control unit 201 sets horizontal size=horizontal vPU size/2 and vertical size=vertical vPU size/2.

Meanwhile, in a case where it is determined in Step S603 that |MV_MAX|<MV_threshold does not hold, the processing proceeds to Step S608.

In Step S608, the inter prediction control unit 201 determines whether or not MV_MAX==MVL0_x || MV_MAX==MVL1_x holds.

In a case where it is determined in Step S608 that MV_MAX==MVL0_x || MV_MAX==MVL1_x holds, the processing proceeds to Step S609.

In Step S609, the inter prediction control unit 201 determines whether or not MV_MAX is smaller than 0.

In a case where it is determined in Step S609 that MV_MAX is smaller than 0, the processing proceeds to Step S610.

In Step S610, the inter prediction control unit 201 sets the left part of the vPU as the SAD calculation region.

In a case where it is determined in Step S609 that MV_MAX is equal to or larger than 0, the processing proceeds to Step S611.

In Step S611, the inter prediction control unit 201 sets the right part of the vPU as the SAD calculation region.

After Step S610 or S611, the processing proceeds to Step S612.

In Step S612, the inter prediction control unit 201 determines whether or not PU size<vPU size holds.

In a case where it is determined in Step S612 that PU size<vPU size holds, the processing proceeds to Step S613.

In Step S613, the inter prediction control unit 201 sets horizontal size=horizontal PU size/4 and vertical size=vertical PU size.

In a case where it is determined in Step S612 that PU size<vPU size does not hold, the processing proceeds to Step S614.

In Step S614, the inter prediction control unit 201 sets horizontal size=horizontal vPU size/4 and vertical size=vertical vPU size.

Further, in a case where it is determined in Step S608 that MV_MAX==MVL0_x || MV_MAX==MVL1_x does not hold, the processing proceeds to Step S615.

In Step S615, the inter prediction control unit 201 determines whether or not MV_MAX<0 holds.

In a case where it is determined in Step S615 that MV_MAX<0 holds, the processing proceeds to Step S616.

In Step S616, the inter prediction control unit 201 sets the upper part of the vPU as the SAD calculation region.

In a case where it is determined in Step S615 that MV_MAX<0 does not hold, the processing proceeds to Step S617.

In Step S617, the inter prediction control unit 201 sets the lower part of the vPU as the SAD calculation region.

After Step S616 or S617, the processing proceeds to Step S618.

In Step S618, the inter prediction control unit 201 determines whether or not PU size<vPU size holds.

In a case where it is determined in Step S618 that PU size<vPU size holds, the processing proceeds to Step S619.

In Step S619, the inter prediction control unit 201 sets horizontal size=horizontal PU size and vertical size=vertical PU size/4.

In a case where it is determined in Step S618 that PU size<vPU size does not hold, the processing proceeds to Step S620.

In Step S620, the inter prediction control unit 201 sets horizontal size=horizontal vPU size and vertical size=vertical vPU size/4.

After Step S606, Step S607, Step S613, Step S614, Step S619, or Step S620, the processing proceeds to Step S621 of FIG. 29.

In Step S621, the inter prediction control unit 201 determines whether or not horizontal size<4 holds.

In a case where it is determined in Step S621 that horizontal size<4 holds, the processing proceeds to Step S622.

In Step S622, the inter prediction control unit 201 sets horizontal size=4, and the processing proceeds to Step S623.

In a case where it is determined in Step S621 that horizontal size<4 does not hold, the processing skips Step S622 and proceeds to Step S623.

In Step S623, the inter prediction control unit 201 determines whether or not vertical size<4 holds.

In a case where it is determined in Step S623 that vertical size<4 holds, the processing proceeds to Step S624.

In Step S624, the inter prediction control unit 201 sets vertical size=4, and the processing of determining a partial SAD calculation region for BIO_vPU_ON determination ends.

In a case where it is determined in Step S623 that vertical size<4 does not hold, the processing skips Step S624, and the processing of determining a partial SAD calculation region for BIO_vPU_ON determination ends.
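The flow of Steps S601 to S624 can be condensed into a single function, as sketched below. The function name `select_sad_region`, the string labels for the region position, and the interpretation of "PU size<vPU size" as a comparison of areas are assumptions made for this illustration and are not stated in the description.

```python
def select_sad_region(mvl0_x, mvl0_y, mvl1_x, mvl1_y,
                      pu_w, pu_h, vpu_w, vpu_h, mv_threshold):
    """Return (position, horizontal size, vertical size) of the SAD region."""
    # S602: the MV component with the largest absolute value becomes MV_MAX.
    mv_max = max((mvl0_x, mvl0_y, mvl1_x, mvl1_y), key=abs)

    # S605 / S612 / S618: use PU dimensions when the PU is smaller than the
    # vPU, otherwise vPU dimensions (assumption: sizes compared as areas).
    if pu_w * pu_h < vpu_w * vpu_h:
        base_w, base_h = pu_w, pu_h
    else:
        base_w, base_h = vpu_w, vpu_h

    if abs(mv_max) < mv_threshold:                      # S603
        position = "center"                             # S604
        width, height = base_w // 2, base_h // 2        # S606 / S607
    elif mv_max == mvl0_x or mv_max == mvl1_x:          # S608: horizontal
        position = "left" if mv_max < 0 else "right"    # S609 to S611
        width, height = base_w // 4, base_h             # S613 / S614
    else:                                               # vertical component
        position = "upper" if mv_max < 0 else "lower"   # S615 to S617
        width, height = base_w, base_h // 4             # S619 / S620

    # S621 to S624: neither dimension may fall below 4 samples.
    return position, max(width, 4), max(height, 4)
```

For example, a dominant negative horizontal component on a 128×128 PU with 64×64 vPUs selects a 16×64 strip on the left side of each vPU, while small MVs select the central 32×32 region.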

The processing of calculating SADs for partial regions to determine whether to apply BIO as described above can also be applied to FRUC and DMVR. However, in FRUC and DMVR, the calculation of SADs or similar costs and the determination thereafter, which in BIO are used only for early termination, are directly reflected in the inter prediction accuracy. Thus, there is a possibility that the price paid for the omission of cost calculation is high, and it can therefore be said that the processing of calculating SADs for partial regions to determine whether to apply BIO is processing unique to BIO.

<2. Second Embodiment (Exemplary Operation Sharing with Flags)>

In a second embodiment, as in the first embodiment, in a case where PUs are larger than VPDUs, the PU is virtually partitioned into vPUs, and the processing is performed in units of vPUs.

In the second embodiment, unlike the first embodiment, 1 bit of the BIO_PU_ON flag is included in bitstreams that are transmitted/received between the encoding device 1 and the decoding device 101 so that the operation of the encoding device 1 and the operation of the decoding device 101 can be shared.

<Operation Example of Inter Prediction Unit>

FIG. 30 and FIG. 31 are flowcharts illustrating, as an operation example according to the second embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

In Steps S701 to S708 and Steps S715 to S728 of FIG. 30 and FIG. 31, processing basically similar to that in Steps S401 to S408 and Steps S412 to S425 of FIG. 17 and FIG. 18 is performed, and hence the redundant description thereof is omitted as appropriate.

In Step S708 of FIG. 30, the BIO cost calculation unit 204 calculates, in units of 4×4 in the vPU, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of 4×4 are accumulated so that SAD 4×4 block that is the sum of the SADs is acquired.

In Step S709, the inter prediction control unit 201 determines whether or not the number of vPUs is 1.

In a case where it is determined in Step S709 that the number of vPUs is 1, the processing proceeds to Step S710. In Steps S710 and S711, processing similar to the processing that is performed in units of PUs is performed.

In Step S710, the BIO cost calculation unit 204 calculates, in units of vPUs, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of vPUs are accumulated so that SAD_PU that is the sum of the SADs is acquired. The acquired SAD_PU is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.

In Step S711, the BIO application determination unit 205 determines the BIO_PU_ON flag on the basis of SAD_PU>=BIO_threshold_PU. SAD_PU is supplied from the BIO cost calculation unit 204 and BIO_threshold_PU is supplied from the inter prediction control unit 201. After that, the processing proceeds to Step S714.

In a case where it is determined in Step S709 that the number of vPUs is not 1, the processing proceeds to Step S712.

In Step S712, the inter prediction control unit 201 determines whether or not the vPU number is 0.

In a case where it is determined in Step S712 that the vPU number is 0, the processing proceeds to Step S713.

In Step S713, the inter prediction control unit 201 sets BIO_PU_ON. In the case of the encoding device 1, BIO_PU_ON determined from a motion estimation (ME) result, for example, is set. In the case of the decoding device 101, BIO_PU_ON acquired from the stream is set.

In a case where it is determined in Step S712 that the vPU number is not 0, the processing skips Step S713 and proceeds to Step S714 of FIG. 31.

In Step S714, it is determined whether or not the BIO_PU_ON flag is 1.

In a case where it is determined in Step S714 that the BIO_PU_ON flag is not 1, the processing proceeds to Step S715 since BIO is not effective for the entire PU.

In Step S715, the Bi prediction block generation unit 206 generates a Bi prediction block vPU from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The generated Bi prediction block vPU is stored in the buffer and supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208.

Meanwhile, in a case where it is determined in Step S714 that the BIO_PU_ON flag is 1, the processing proceeds to Step S716.

In Step S716, the BIO processing-included Bi prediction block generation unit 207 calculates a plurality of gradients from the L0 prediction block supplied from the L0 prediction block generation unit 202 and the L1 prediction block supplied from the L1 prediction block generation unit 203.

As described above, when the BIO_PU_ON flag is included in bitstreams, the operation of the encoding device 1 and the operation of the decoding device 101 can be shared.

Note that, since a deterioration in encoding efficiency due to the inclusion of the flag in bitstreams is a concern, the BIO_PU_ON flag is not included in all the layers, but is included only in a case where PUs are larger than VPDUs so that the cost of 1 bit is relatively small. In a case where PUs are not larger than VPDUs, as illustrated in Steps S709 to S713 of FIG. 30, SAD values are calculated in units of PUs and whether to apply BIO is determined as in the first embodiment.

In a case where the BIO_PU_ON flag is included in bitstreams, the encoding device 1 may freely set the BIO_PU_ON flag to 0 or 1. When the encoding device 1 is a sufficiently high-performance device, a determination method in which motion compensation is performed with the BIO_PU_ON flag set to each of 0 and 1, and the value that provides the more favorable result is selected may be employed. Further, a determination method in which the BIO_PU_ON flag is set to 0 when the PU size is 128×128, and is otherwise set to 1 may be employed.

Meanwhile, in the decoding device 101, the BIO_PU_ON flag is decoded on the PU layer of the CU in the Bi prediction mode in which the PUs are larger than the VPDUs so that, when the vPU number is 0, the BIO_PU_ON flag is acquired in Step S713, and the processing proceeds. In the vPUs having the vPU numbers of 1 or larger, for which the BIO_PU_ON flag has already been set, the processing skips Step S713 and proceeds from Step S712 to Step S714.
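The way BIO_PU_ON is obtained in Steps S709 to S713 can be sketched as below. The function name `resolve_bio_pu_on` and its parameters are hypothetical; `signalled_flag` stands for the value determined by the encoder (for example from an ME result) or parsed from the bitstream by the decoder.

```python
def resolve_bio_pu_on(num_vpus, vpu_number, current_flag,
                      sad_pu=None, bio_threshold_pu=None, signalled_flag=None):
    """Return the BIO_PU_ON value to use for the vPU being processed."""
    if num_vpus == 1:
        # S709 to S711: the PU fits in one VPDU, so the flag is derived from
        # the SAD as in the first embodiment and nothing is signalled.
        return sad_pu >= bio_threshold_pu
    if vpu_number == 0:
        # S712, S713: first vPU of a large PU takes the signalled 1-bit flag.
        return bool(signalled_flag)
    # Later vPUs skip S713 and reuse the flag already set for the PU.
    return current_flag
```

Because the flag for a large PU comes from the bitstream, the encoder and the decoder reach the same decision without accumulating a SAD across the entire PU.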

A method similar to the second embodiment described above is applicable to FRUC and DMVR, but the application of the second embodiment to FRUC or DMVR is mostly pointless. This is because including data for MV correction in bitstreams substantially means that difference MVs (MVDs) are encoded.

<3. Third Embodiment (Exemplary Partition with sPUs)>

In a third embodiment, a virtual partition size is different from that of the first embodiment. In a case where PUs are larger than VPDUs, the PU is virtually partitioned into sPUs, and the processing is performed in units of sPUs.

That is, since a unit of the processing of calculating SADs to determine whether to apply BIO may be any unit that does not cross over VPDU boundaries and is equal to or smaller than the VPDU size, in the third embodiment, a PU is virtually partitioned into a plurality of sPUs with separately given information, and whether to apply BIO is determined for each sPU.

To give the information, a variable such as BIO_MAX_SAD_BLOCK_SIZE is added to and included in bitstreams to be shared by the encoding device 1 and the decoding device 101.

FIG. 32 is a diagram illustrating the correspondence between BIO_MAX_SAD_BLOCK_SIZE and sPU.

In a case where BIO_MAX_SAD_BLOCK_SIZE is 1, the sPU size is 8×8. In a case where BIO_MAX_SAD_BLOCK_SIZE is 2, the sPU size is 16×16. In a case where BIO_MAX_SAD_BLOCK_SIZE is 3, the sPU size is 32×32. In a case where BIO_MAX_SAD_BLOCK_SIZE is 4, the sPU size is 64×64.

The value of BIO_MAX_SAD_BLOCK_SIZE may be set to any value based on the performance of the encoding device 1, or may be determined in advance as a profile/level constraint serving as a standard. There is a level constraint that sets BIO_MAX_SAD_BLOCK_SIZE depending on picture sizes to be handled, that is, sets BIO_MAX_SAD_BLOCK_SIZE to 0 for SD or less, 1 for HD, 2 for 4K, and 3 for 8K, for example.

<Operation Example of Inter Prediction Unit>

FIG. 33 and FIG. 34 are flowcharts illustrating, as an operation example according to the third embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

Note that, in Steps S801 to S825 of FIG. 33 and FIG. 34, processing basically similar to that in Steps S401 to S425 of FIG. 17 and FIG. 18 is performed although the vPU is replaced by the sPU, which is different from the vPU in size, and hence the redundant description thereof is omitted as appropriate.

FIG. 35 and FIG. 36 are diagrams illustrating exemplary regions for calculating SADs in each PU in a case where BIO_MAX_SAD_BLOCK_SIZE is 2.

In the upper part of FIG. 35, there are illustrated regions for calculating SADs for sPUs in a case where the CU (PU) is 128×128, VPDU=64×64 holds, and BIO_MAX_SAD_BLOCK_SIZE is 2 (sPU=32×32). In the case of the upper part of FIG. 35, the PU is partitioned into 16 sPUs that do not cross over the VPDU boundaries.

In the lower part of FIG. 35, there are illustrated regions for calculating SADs for sPUs in a case where the CU (PU) is 128×64, VPDU=64×64 holds, and BIO_MAX_SAD_BLOCK_SIZE is 2 (sPU=32×32). In the case of the lower part of FIG. 35, the PU is partitioned into eight sPUs that do not cross over the VPDU boundaries.

In the upper part of FIG. 36, there are illustrated regions for calculating SADs for sPUs in a case where the CU (PU) is 64×128, VPDU=64×64 holds, and BIO_MAX_SAD_BLOCK_SIZE is 2 (sPU=32×32). In the case of the upper part of FIG. 36, the PU is partitioned into eight sPUs that do not cross over the VPDU boundaries.

In the lower part of FIG. 36, there are illustrated regions for calculating SADs for sPUs in a case where the CU (PU) is 64×64 or less, VPDU=64×64 holds, and BIO_MAX_SAD_BLOCK_SIZE is 2 (sPU=32×32). In the case of the lower part of FIG. 36, the PU is partitioned into four sPUs that do not cross over the VPDU boundaries.

As described above, in the third embodiment of the present technology, a PU is virtually partitioned into a plurality of sPUs with separately given information, and whether to apply BIO is determined for each sPU. With this, the buffer size can be further reduced as compared to the buffer size in the case of using vPUs.
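The sPU counts given for FIG. 35 and FIG. 36 can be reproduced with a short sketch. The function name `partition_into_spus` is hypothetical, and the sketch assumes that the PU origin is aligned to the VPDU grid and that the sPU size divides the VPDU size, which together guarantee that no sPU crosses a VPDU boundary.

```python
def partition_into_spus(pu_w, pu_h, spu_size, vpdu_size=64):
    """List (x, y, width, height) of the sPUs of a PU in raster scan order."""
    if spu_size <= 0 or vpdu_size % spu_size != 0:
        raise ValueError("sPU size must be positive and divide the VPDU size")
    # A PU smaller than the sPU size is kept as a single unit per dimension.
    step_w = min(spu_size, pu_w)
    step_h = min(spu_size, pu_h)
    return [
        (x, y, step_w, step_h)
        for y in range(0, pu_h, step_h)
        for x in range(0, pu_w, step_w)
    ]
```

With an sPU size of 32×32, `len(partition_into_spus(128, 128, 32))` gives 16 and `len(partition_into_spus(128, 64, 32))` gives 8, matching the partitions described for FIG. 35.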

<4. Fourth Embodiment (Example in which Use of BIO is Prohibited)>

In a fourth embodiment, in a case where PUs are larger than VPDUs, the use of BIO is constrained. With this, the buffer size can be reduced.

<Operation Example of Inter Prediction Unit>

FIG. 37 and FIG. 38 are flowcharts illustrating, as an operation example according to the fourth embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

In Steps S901 to S907 and S926 of FIG. 37 and FIG. 38, processing basically similar to that in Steps S401 to S407 and S425 of FIG. 17 and FIG. 18 is performed, and hence the redundant description thereof is omitted as appropriate. Further, in Steps S909 to S925 of FIG. 37 and FIG. 38, processing basically similar to that in Steps S304 to S320 of FIG. 15 and FIG. 16 is performed, and hence the redundant description thereof is omitted as appropriate.

In Step S907, the L1 prediction block generation unit 203 generates an L1 prediction block in the region of the vPU number.

In Step S908, the inter prediction control unit 201 determines whether or not 1<the number of vPUs holds.

In a case where it is determined in Step S908 that 1<the number of vPUs does not hold, the processing proceeds to Step S909. In a case where the number of vPUs is 1, that is, vPU=PU holds, the processing subsequent to Step S909 is similar to the processing subsequent to Step S304 of FIG. 15.

In a case where it is determined in Step S908 that 1<the number of vPUs holds, the processing proceeds to Step S913 of FIG. 38.

Further, in a case where it is determined in Step S912 that the BIO_vPU_ON flag is not 1, the processing proceeds to Step S913 since BIO is not effective for the entire vPU.

In Step S913, the Bi prediction block generation unit 206 generates a Bi prediction block vPU from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The generated Bi prediction block vPU is stored in the buffer and supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208.

As described above, in FIG. 37 and FIG. 38, between Steps S907 and S913, Step S908 is added as the conditional branch step for determining whether or not there are a plurality of vPUs, that is, whether or not a PU is larger than a VPDU.

In a case where the PU is larger than the VPDU, the processing proceeds from Step S908 to normal Bi prediction in Step S913, in which BIO is not used and SAD value calculation for the entire PU is unnecessary, and hence, as in FIG. 4, the PU can be partitioned into virtual vPUs to be processed.

The processing in Steps S909 to S925, which come after the branch in Step S908, is similar to that in the related-art BIO-included Bi prediction (S304 to S320 of FIG. 15 and FIG. 16). However, the processing proceeds to Step S909 only in a case where the PU is equal to or smaller than the VPDU, and hence SAD calculation for the entire PU only uses a resource equal to or smaller than the VPDU.
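The branch added in Step S908 can be sketched as follows. The helper names `count_vpus` and `bio_allowed_fourth` are hypothetical, and the vPU count is assumed here to be the number of VPDU-sized tiles needed to cover the PU.

```python
def count_vpus(pu_w, pu_h, vpdu_size=64):
    """Number of vPUs covering a PU (ceiling division in each dimension)."""
    cols = -(-pu_w // vpdu_size)
    rows = -(-pu_h // vpdu_size)
    return cols * rows


def bio_allowed_fourth(pu_w, pu_h, sad_pu, bio_threshold_pu, vpdu_size=64):
    """Fourth embodiment: BIO is never used when the PU exceeds the VPDU."""
    if count_vpus(pu_w, pu_h, vpdu_size) > 1:
        # S908 -> S913: normal Bi prediction, no PU-wide SAD is needed.
        return False
    # S909 onward: related-art decision on a PU that fits within one VPDU.
    return sad_pu >= bio_threshold_pu
```

In this sketch the `sad_pu` argument is simply ignored for large PUs, which reflects that no SAD buffer larger than the VPDU is required.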

<5. Fifth Embodiment (Example in which BIO is Always Applied)>

In a fifth embodiment, in a case where PUs are larger than VPDUs, BIO is always applied so that the buffer size is reduced.

<Operation Example of Inter Prediction Unit>

FIG. 39 and FIG. 40 are flowcharts illustrating, as an operation example according to the fifth embodiment of the present technology, BIO-included Bi prediction that is performed by the inter prediction unit 51.

In Steps S1001 to S1008 and S1026 of FIG. 39 and FIG. 40, processing basically similar to that in Steps S401 to S408 and S425 of FIG. 17 and FIG. 18 is performed, and hence the redundant description thereof is omitted as appropriate. Further, in Steps S1014 to S1025 of FIG. 39 and FIG. 40, processing basically similar to that in Steps S309 to S320 of FIG. 15 and FIG. 16 is performed, and hence the redundant description thereof is omitted as appropriate.

In Step S1008, the BIO cost calculation unit 204 calculates, in units of 4×4 in the vPU, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of 4×4 are accumulated so that SAD 4×4 block that is the sum of the SADs is acquired.

In Step S1009, the inter prediction control unit 201 determines whether or not 1<the number of vPUs holds.

In a case where it is determined in Step S1009 that 1<the number of vPUs does not hold, the processing proceeds to Step S1010.

In Step S1010, the BIO cost calculation unit 204 calculates, in units of PUs, the SAD of the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The SADs calculated in units of PUs are accumulated so that SAD_PU that is the sum of the SADs is acquired. The acquired SAD_PU is supplied from the BIO cost calculation unit 204 to the BIO application determination unit 205.

In Step S1011, the BIO application determination unit 205 determines the BIO_PU_ON flag on the basis of SAD_PU>=BIO_threshold_PU. SAD_PU is supplied from the BIO cost calculation unit 204 and BIO_threshold_PU is supplied from the inter prediction control unit 201.

In Step S1012, it is determined whether or not the BIO_PU_ON flag is 1.

In a case where it is determined in Step S1012 that the BIO_PU_ON flag is not 1, the processing proceeds to Step S1013 of FIG. 40 since BIO is not effective for the entire vPU.

In Step S1013, the Bi prediction block generation unit 206 generates a Bi prediction block vPU from the L0 prediction image supplied from the L0 prediction block generation unit 202 and the L1 prediction image supplied from the L1 prediction block generation unit 203. The generated Bi prediction block vPU is stored in the buffer and supplied from the Bi prediction block generation unit 206 to the Bi prediction block selection unit 208.

In a case where it is determined in Step S1012 that the BIO_PU_ON flag is 1, the processing proceeds to Step S1014 of FIG. 40.

Further, in a case where it is determined in Step S1009 that 1<the number of vPUs holds, the processing proceeds to Step S1014.

In Step S1014 and the later steps, BIO processing similar to that in Steps S309 to S320 of FIG. 15 and FIG. 16 is performed.

As described above, in FIG. 39 and FIG. 40, the conditional branch for determining whether or not there are a plurality of vPUs, that is, whether or not a PU is larger than a VPDU, is added in Step S1009.

In a case where the PU is larger than the VPDU, the processing bypasses the steps from the SAD calculation to the threshold determination in Steps S1010 to S1012 and proceeds to the BIO application processing in Step S1014 and the later steps so that SAD calculation for the entire PU is not necessary, and hence, as in FIG. 4, the PU can be partitioned into virtual vPUs to be processed.

The processing proceeds to Steps S1010 to S1012 only in a case where the PU is equal to or smaller than the VPDU, and hence SAD calculation for the entire PU only uses a resource equal to or smaller than the VPDU.
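The bypass of Steps S1010 to S1012 can be illustrated with a sketch in which the PU-level SAD is passed as a callable, so that it is evaluated only when actually needed. The function name `bio_applied_fifth` and the callable parameter `compute_sad_pu` are assumptions made for this illustration.

```python
def bio_applied_fifth(num_vpus, compute_sad_pu, bio_threshold_pu):
    """Fifth embodiment: BIO is always applied when the PU has several vPUs."""
    if num_vpus > 1:
        # S1009 -> S1014: the SAD calculation and threshold test are bypassed.
        return True
    # S1010 to S1012: PU fits in one VPDU, so the usual SAD test is performed.
    return compute_sad_pu() >= bio_threshold_pu
```

Passing the SAD as a callable makes the point of the embodiment visible: for a PU larger than the VPDU, the callable is never invoked, so no PU-wide accumulation takes place.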

Note that the fifth embodiment is not applicable to FRUC and DMVR for the following reason. Since SAD calculation in BIO is for the purpose of early termination, the cost calculation can be avoided with another criterion such as the PU size as in the fifth embodiment. Cost calculation in FRUC and DMVR is, however, key processing in MV correction, and is difficult to avoid.

As described above, according to the present technology, a unit of processing in calculation of a cost that is used for determining whether or not to perform bidirectional prediction such as BIO is partitioned into partitioned processing units each of which corresponds to the VPDU size (for example, vPU) or is equal to or smaller than the VPDU size (for example, sPU), and the determination is made by using the cost calculated on the basis of the partitioned processing units. With this, the buffer size can be reduced.

VVC can be implemented with BIO so that the necessary sizes of the various buffers can be reduced to ¼ of the related-art buffer sizes.

Further, the HW configuration can be optimized so that BIO can be implemented with the buffers, some of which have sizes much smaller than ¼ of the related-art sizes.

<6. Sixth Embodiment (Computer)>

<Configuration Example of Computer>

The series of processes described above can be executed by hardware or software. In a case where the series of processes is executed by software, a program configuring the software is installed from a program recording medium on a computer incorporated in dedicated hardware or on a general-purpose personal computer.

FIG. 41 is a block diagram illustrating a configuration example of the hardware of a computer configured to execute the above-mentioned series of processes with the program.

A CPU (Central Processing Unit) 301, a ROM (Read Only Memory) 302, and a RAM (Random Access Memory) 303 are connected to each other through a bus 304.

An input/output interface 305 is further connected to the bus 304. The input/output interface 305 is connected to an input unit 306 including a keyboard, a mouse, or the like and an output unit 307 including a display, a speaker, or the like. Further, the input/output interface 305 is connected to a storage unit 308 including a hard disk, a non-volatile memory, or the like, a communication unit 309 including a network interface or the like, and a drive 310 configured to drive a removable medium 311.

In the computer configured as described above, for example, the CPU 301 loads the program stored in the storage unit 308 into the RAM 303 through the input/output interface 305 and the bus 304 and executes the program to perform the series of processes described above.

The program that is executed by the CPU 301 can be recorded on the removable medium 311 to be installed on the storage unit 308, for example. Alternatively, the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting to be installed on the storage unit 308.

Note that, as for the program that is executed by the computer, the processes of the program may be performed chronologically in the order described herein or in parallel. Alternatively, the processes may be performed at appropriate timings, for example, when the program is called.

Note that, a system herein means a set of plural components (devices,modules (parts), or the like), and it does not matter whether or not allthe components are in the same housing. Thus, plural devices that areaccommodated in separate housings and connected to each other via anetwork, and a single device in which plural modules are accommodated ina single housing are both systems.

Note that, the effects described herein are only exemplary and notlimited, and other effects may be provided.

The embodiment of the present technology is not limited to theembodiments described above, and various modifications can be madewithout departing from the gist of the present technology.

For example, the present technology can be implemented as cloudcomputing in which a single function is shared and processed by pluraldevices via a network.

Further, the steps of the flowcharts described above may be executed by a single device or shared and executed by plural devices.

Further, in a case where plural processes are included in a single step, the plural processes included in the single step can be executed by a single device or shared and executed by plural devices.

<Combination Examples of Configurations>

The present technology can also take the following configurations.

-   (1)

An image processing device including:

a control unit configured to partition a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction; and

a determination unit configured to make the determination by using the cost calculated based on the partitioned processing units.

-   (2)

The image processing device according to Item (1), in which

the determination unit makes the determination by using the cost calculated by each of the partitioned processing units.

-   (3)

The image processing device according to Item (1), in which

the determination unit makes, by using the cost calculated for a first one of the partitioned processing units, the determination on the first one of the partitioned processing units, and makes the determination on another of the partitioned processing units by using a result of the determination on the first one of the partitioned processing units.

-   (4)

The image processing device according to Item (1), in which

the determination unit makes the determination by each of the partitioned processing units by using the cost calculated for each of partial regions in the partitioned processing units.

-   (5)

The image processing device according to Item (1), in which

the determination unit makes the determination by each of the partitioned processing units based on a flag set to each of the partitioned processing units, the flag indicating whether or not to perform the bidirectional prediction.

-   (6)

The image processing device according to any one of Items (1) to (5), in which

the bidirectional prediction includes the bidirectional prediction employing BIO.

-   (7)

The image processing device according to Item (1) or (2), in which

the bidirectional prediction includes the bidirectional prediction employing FRUC or DMVR.

-   (8)

An image processing method for causing an image processing device to:

partition a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction; and

make the determination by using the cost calculated based on the partitioned processing units.
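
As a purely illustrative aid, and not as part of the configurations above, the partitioning of Item (1) and the determinations of Items (2) and (3) can be sketched as follows. The sketch assumes a 64×64 VPDU, a SAD (sum of absolute differences) between the L0 and L1 prediction blocks as the cost, and a per-pixel threshold; all function names, the cost measure, the threshold value, and the direction of the comparison are hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

VPDU_SIZE = 64  # assumption: VPDU is 64x64, matching the maximum TU size

# A partitioned processing unit as (x, y, width, height) inside the PU.
Unit = Tuple[int, int, int, int]
Samples = Sequence[Sequence[int]]  # 2-D array of prediction samples


def partition_into_units(width: int, height: int,
                         vpdu: int = VPDU_SIZE) -> List[Unit]:
    """Split a width x height unit of processing into units no larger
    than vpdu x vpdu, in raster order (Item (1))."""
    units = []
    for y in range(0, height, vpdu):
        for x in range(0, width, vpdu):
            units.append((x, y, min(vpdu, width - x), min(vpdu, height - y)))
    return units


def sad_cost(l0: Samples, l1: Samples, unit: Unit) -> int:
    """Hypothetical cost: SAD between L0 and L1 samples inside one unit."""
    x, y, w, h = unit
    return sum(abs(l0[r][c] - l1[r][c])
               for r in range(y, y + h)
               for c in range(x, x + w))


def decide_per_unit(l0: Samples, l1: Samples, width: int, height: int,
                    threshold_per_pixel: float = 1.0) -> List[bool]:
    """Item (2)-style sketch: one decision per partitioned unit, each
    using only the cost computed for that unit."""
    decisions = []
    for unit in partition_into_units(width, height):
        _, _, w, h = unit
        # Assumed rule: apply the refinement when the cost is not small.
        decisions.append(sad_cost(l0, l1, unit) >= threshold_per_pixel * w * h)
    return decisions


def decide_from_first_unit(l0: Samples, l1: Samples, width: int, height: int,
                           threshold_per_pixel: float = 1.0) -> List[bool]:
    """Item (3)-style sketch: decide on the first unit from its own cost,
    then reuse that result for every other unit."""
    units = partition_into_units(width, height)
    _, _, w, h = units[0]
    first = sad_cost(l0, l1, units[0]) >= threshold_per_pixel * w * h
    return [first] * len(units)


if __name__ == "__main__":
    size = 128
    l0 = [[0] * size for _ in range(size)]
    # L1 differs from L0 only in the top-left 64x64 region.
    l1 = [[2 if (r < 64 and c < 64) else 0 for c in range(size)]
          for r in range(size)]
    print(partition_into_units(size, size))
    print(decide_per_unit(l0, l1, size, size))
    print(decide_from_first_unit(l0, l1, size, size))
```

In both variants the cost for any one decision is accumulated over at most one VPDU-sized region, so the L0 and L1 samples that must be held at a time are bounded by the VPDU size rather than by the full 128×128 unit.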

REFERENCE SIGNS LIST

1: Encoding device

36: Lossless encoding unit

47: Motion prediction/compensation unit

51: Inter prediction unit

101: Decoding device

132: Lossless decoding unit

201: Inter prediction control unit

202: L0 prediction block generation unit

203: L1 prediction block generation unit

204: BIO cost calculation unit

205: BIO application determination unit

206: Bi prediction block generation unit

207: BIO processing-included Bi prediction block generation unit

208: Bi prediction block selection unit

209: Prediction block selection unit

1. An image processing device comprising: a control unit configured to partition a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction; and a determination unit configured to make the determination by using the cost calculated based on the partitioned processing units.
2. The image processing device according to claim 1, wherein the determination unit makes the determination by using the cost calculated by each of the partitioned processing units.
3. The image processing device according to claim 1, wherein the determination unit makes, by using the cost calculated for a first one of the partitioned processing units, the determination on the first one of the partitioned processing units, and makes the determination on another of the partitioned processing units by using a result of the determination on the first one of the partitioned processing units.
4. The image processing device according to claim 1, wherein the determination unit makes the determination by each of the partitioned processing units by using the cost calculated for each of partial regions in the partitioned processing units.
5. The image processing device according to claim 1, wherein the determination unit makes the determination by each of the partitioned processing units based on a flag set to each of the partitioned processing units, the flag indicating whether or not to perform the bidirectional prediction.
6. The image processing device according to claim 1, wherein the bidirectional prediction includes the bidirectional prediction employing BIO.
7. The image processing device according to claim 1, wherein the bidirectional prediction includes the bidirectional prediction employing FRUC or DMVR.
8. An image processing method for causing an image processing device to: partition a unit of processing into partitioned processing units each of which corresponds to a VPDU size or is equal to or smaller than the VPDU size, the unit of processing being used for calculation of a cost that is used for determining whether or not to perform bidirectional prediction; and make the determination by using the cost calculated based on the partitioned processing units.